---
title: "Supplementary Material"
subtitle: "Validating V-Party's Party Organizational Items"
output: 
  html_document:
    code_folding: hide
    fig_height: 6
    fig_width: 9
    theme: cosmo
    toc: yes
    toc_depth: 1
    toc_float: yes
---

</br>

*Supplementary information for "Party Organizations Around the Globe. Introducing the Varieties of Party Identity and Organization Dataset (V-Party)"*

*Version: October 2021*

</br>

In this supplement we report comprehensive descriptive and summary statistics for V-Party's organizational items, present the factor analysis used to identify three dimensions of party organizational features -- *organizational extensiveness*, *intra-party power concentration* and *elite cohesion* -- and report the regression results of the survival analysis. The supplement consists of five parts:

1.  In **Part One** we describe six items from V-Party capturing different aspects of party organizations.
2.  In **Part Two** we conduct an exploratory factor analysis to uncover the dimensionality of the data and operationalize three dimensions of party organizational features.
3.  In **Part Three** we present descriptive insights for the three main dimensions.
4.  In **Part Four** we cross-validate the aggregated indices and single items with extant data on party organizations.
5.  In **Part Five** we test the impact of party organizational features on party survival to evaluate the construct validity of V-Party data.

```{r message=FALSE, warning=FALSE, include=FALSE}

# clean env
rm(list = ls())

    # load packages
library(tidyverse)
library(fuzzyjoin)
library(summarytools)
library(skimr)
library(kableExtra)
library(grid)
library(gridExtra)
library(ggridges)
library(ggpubr)
library(GGally)
library(psych)
library(lme4)
library(sandwich)
library(lmtest)
library(texreg)
library(export)

    # load v-party
load("../data/vparty_for_partyorga.Rdata")

    # omit closed autocracies for the validation report
df <- df %>% filter(v2x_regime > 0)

    # set skim "defaults"
my_skim <- skim_with(base = sfl(n=n_complete), numeric = sfl(hist=NULL, n_complete=NULL, n_missing=NULL))
options(digits = 3)

    # dataset information (one summary row for the complete-case sample)
dataset_info <-
    df %>%
    filter(!is.na(orgext) & !is.na(powercon) & !is.na(cohesion)) %>%
    summarise(
        countries = n_distinct(country_id),
        parties = n_distinct(v2paid),
        elections = n_distinct(election_id),
        year_min = min(year),
        year_max = max(year))
```

</br>

------------------------------------------------------------------------

# Part One - Items

------------------------------------------------------------------------

</br>

The following table lists six V-Party items that capture different aspects of party organization, along with their variable names, question wording, and response categories. For all items, experts were asked to place the focal party on a five-point scale for every election year. The answers were then aggregated with a Bayesian item response theory measurement model, following V-Dem's standard methodology (Pemstein et al. 2019). Below we show summary statistics and the distribution of each item. *We exclude closed autocracies, where the executive branch is not subject to elections, and observations with fewer than three coders per item.* This threshold is slightly less strict than advised in the "Cautionary Notes" (Lührmann et al. 2020, 5). Yet, as our report shows, the data perform well even under this more lenient threshold, which further underlines the validity of V-Party data. At times, we distinguish between regime types in this report using the Regimes of the World measure (Lührmann, Tannenberg, and Lindberg 2018). The overall sample for this supplement thus encompasses **`r dataset_info$parties` parties** and **`r dataset_info$elections` elections** in **`r dataset_info$countries` countries** from **`r dataset_info$year_min`** to **`r dataset_info$year_max`**.

</br>

<table class="table table-striped table-hover">
<thead>
  <tr>
    <th>Item</th>
    <th>Variable</th>
    <th>Question</th>
    <th>Response categories</th>
  </tr>
</thead>
<tbody>
  <tr>
    <td>Local party offices</td>
    <td>v2palocoff</td>
    <td>Does this party maintain permanent offices that operate outside of election campaigns at the local or municipal-level?</td>
    <td>0: The party does not have permanent local offices.<br>1: The party has permanent local offices in few municipalities.<br>2: The party has permanent local offices in some municipalities.<br>3: The party has permanent local offices in most municipalities.<br>4: The party has permanent local offices in all or almost all municipalities.</td>
  </tr>
  <tr>
    <td>Active community presence</td>
    <td>v2paactcom</td>
    <td>To what degree are party activists and personnel permanently active in local communities?</td>
    <td>0: There is negligible permanent presence of party activists and personnel in local communities.<br>1: There is minor permanent presence of party activists and personnel in local communities.<br>2: There is noticeable permanent presence of party activists and personnel in local communities.<br>3: There is significant permanent presence of party activists and personnel in local communities.<br>4: There is widespread permanent presence of party activists and personnel in local communities.</td>
  </tr>
  <tr>
    <td>Affiliate organizations</td>
    <td>v2pasoctie</td>
    <td>To what extent does this party maintain ties to prominent social organizations?</td>
    <td>0: The party does not maintain ties to any prominent social organization.<br>1: The party maintains weak ties to prominent social organizations.<br>2: The party maintains moderate ties to prominent social organizations.<br>3: The party maintains strong ties to prominent social organizations.<br>4: The party controls prominent social organizations.<br></td>
  </tr>
  <tr>
    <td>Candidate nomination</td>
    <td>v2panom</td>
    <td>Which of the following options best describes the process by which the party decides on candidates for the national legislative elections?</td>
    <td>0: The party leader unilaterally decides on which candidates will run for the party in national legislative elections.<br>1: The national party leadership (i.e. an executive committee) collectively decides which candidates will run for the party in national legislative elections.<br>2: Delegates of local/regional organizations decide which candidates will run for the party in national legislative elections.<br>3: All party members decide on which candidates will run for the party in national legislative elections in primaries/caucuses.<br>4: All registered voters decide on which candidates will run for the party in national legislative elections in primaries/caucuses.</td>
  </tr>
  <tr>
    <td>Internal cohesion</td>
    <td>v2padisa</td>
    <td>To what extent do the elites in this party display disagreement over party strategies?</td>
    <td>0: Party elites display almost complete disagreement over party strategies and many party elites have left the party.<br>1: Party elites display a high level of visible disagreement over party strategies and some of them have left the party.<br>2: Party elites display some visible disagreement over party strategies, but none of them have left the party.<br>3: Party elites display negligible visible disagreement over party strategies.<br>4: Party elites display virtually no visible disagreement over party strategies. </td>
  </tr>
  <tr>
    <td>Personalization of party</td>
    <td>v2paind</td>
    <td>To what extent is this party a vehicle for the personal will and priorities of one individual leader?</td>
    <td>0: The party is not focused on the personal will and priorities of one individual leader.<br>1: The party is occasionally focused on the personal will and priorities of one individual party leader.<br>2: The party is somewhat focused on the personal will and priorities of one individual party leader.<br>3: The party is mainly focused on the personal will and priorities of one individual party leader.<br>4: The party is solely focused on the personal will and priorities of one individual party leader.</td>
  </tr>
</tbody>
</table>
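
The coder threshold described above can be illustrated with a short sketch. It assumes toy data and assumes that variables with the suffix `_nr` (e.g. `v2palocoff_nr`) hold the number of experts who rated an item for a given observation. Setting an item to missing, rather than dropping the whole row, is a choice made here for illustration only; `toy`, `min_coders` and `toy_masked` are hypothetical names and not part of the V-Party data.

```r
library(dplyr)

# Toy data (hypothetical values): two items and their coder counts
toy <- tibble(
  v2palocoff    = c(0.5, 1.2, -0.3),
  v2palocoff_nr = c(5, 2, 3),
  v2panom       = c(1.1, 0.4, -0.8),
  v2panom_nr    = c(2, 4, 6)
)

# Assumed threshold: keep an item only if at least three experts rated it
min_coders <- 3

# Set an item to NA wherever fewer than `min_coders` experts rated it
toy_masked <- toy %>%
  mutate(
    v2palocoff = if_else(v2palocoff_nr >= min_coders, v2palocoff, NA_real_),
    v2panom    = if_else(v2panom_nr    >= min_coders, v2panom,    NA_real_)
  )

toy_masked
```

Masking per item keeps an observation in the sample for all items that meet the threshold, which is why the number of complete cases can differ across items in the summary tables.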

```{r, message=FALSE, warning=FALSE}

    # Descriptive stats
df %>%
  select(v2palocoff, v2paactcom, v2pasoctie, v2panom, v2padisa, v2paind) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)  
```

```{r, message=FALSE, warning=FALSE, fig.align='center'}

    # Density plots by regime type (fill is the mapped aesthetic, so the palette is set via scale_fill_brewer)
hist1 <- df %>% ggplot(aes(x=v2palocoff)) + geom_density(aes(fill = factor(v2x_regime_lab)), alpha=0.3, color="black") + geom_rug(sides = "b", alpha = 0.05) + theme_minimal() + theme(legend.position = "bottom") + scale_fill_brewer(palette="Set1") + labs(fill = "Regimes of the World (RoW)")
hist2 <- df %>% ggplot(aes(x=v2paactcom)) + geom_density(aes(fill = factor(v2x_regime_lab)), alpha=0.3, color="black") + geom_rug(sides = "b", alpha = 0.05) + theme_minimal() + theme(legend.position = "none") + scale_fill_brewer(palette="Set1")
hist3 <- df %>% ggplot(aes(x=v2pasoctie)) + geom_density(aes(fill = factor(v2x_regime_lab)), alpha=0.3, color="black") + geom_rug(sides = "b", alpha = 0.05) + theme_minimal() + theme(legend.position = "none") + scale_fill_brewer(palette="Set1")
hist4 <- df %>% ggplot(aes(x=v2panom))    + geom_density(aes(fill = factor(v2x_regime_lab)), alpha=0.3, color="black") + geom_rug(sides = "b", alpha = 0.05) + theme_minimal() + theme(legend.position = "none") + scale_fill_brewer(palette="Set1")
hist5 <- df %>% ggplot(aes(x=v2padisa))   + geom_density(aes(fill = factor(v2x_regime_lab)), alpha=0.3, color="black") + geom_rug(sides = "b", alpha = 0.05) + theme_minimal() + theme(legend.position = "none") + scale_fill_brewer(palette="Set1")
hist6 <- df %>% ggplot(aes(x=v2paind))    + geom_density(aes(fill = factor(v2x_regime_lab)), alpha=0.3, color="black") + geom_rug(sides = "b", alpha = 0.05) + theme_minimal() + theme(legend.position = "none") + scale_fill_brewer(palette="Set1")

ggarrange(hist1, hist2, hist3, hist4, hist5, hist6, ncol = 3, nrow = 2, common.legend = TRUE, legend="bottom")
```

```{r, message=FALSE, warning=FALSE, fig.align='center'}

  # scatterplots
scat1 <-
  df %>%
  ggplot(aes(year, v2palocoff)) + 
  geom_point(aes(colour = factor(v2x_regime_lab)), alpha=0.1) + 
  geom_smooth(aes(colour = factor(v2x_regime_lab)), size=2) + 
  theme_minimal() + theme(legend.position = "bottom") + 
  scale_color_brewer(palette="Set1") +
  labs(colour = "Regimes of the World (RoW)")


scat2 <-
  df %>%
  ggplot(aes(year, v2paactcom)) + 
  geom_point(aes(colour = factor(v2x_regime_lab)), alpha=0.1) + 
  geom_smooth(aes(colour = factor(v2x_regime_lab)), size=2) +
  scale_color_brewer(palette="Set1") +
  theme_minimal() + theme(legend.position = "none")

scat3 <-
  df %>%
  ggplot(aes(year, v2pasoctie)) + 
  geom_point(aes(colour = factor(v2x_regime_lab)), alpha=0.1) + 
  geom_smooth(aes(colour = factor(v2x_regime_lab)), size=2) +
  scale_color_brewer(palette="Set1") +
  theme_minimal() + theme(legend.position = "none")

scat4 <-
  df %>%
  ggplot(aes(year, v2panom)) + 
  geom_point(aes(colour = factor(v2x_regime_lab)), alpha=0.1) + 
  geom_smooth(aes(colour = factor(v2x_regime_lab)), size=2) +
  scale_color_brewer(palette="Set1") +
  theme_minimal() + theme(legend.position = "none")

scat5 <-
  df %>%
  ggplot(aes(year, v2padisa)) + 
  geom_point(aes(colour = factor(v2x_regime_lab)), alpha=0.1) + 
  geom_smooth(aes(colour = factor(v2x_regime_lab)), size=2) +
  scale_color_brewer(palette="Set1") +
  theme_minimal() + theme(legend.position = "none")

scat6 <-
  df %>%
  ggplot(aes(year, v2paind)) + 
  geom_point(aes(colour = factor(v2x_regime_lab)), alpha=0.1) + 
  geom_smooth(aes(colour = factor(v2x_regime_lab)), size=2) +
  scale_color_brewer(palette="Set1") +
  theme_minimal() + theme(legend.position = "none")

ggarrange(scat1, scat2, scat3, scat4, scat5, scat6, ncol = 3, nrow = 2, common.legend = TRUE, legend="bottom")
```

## Regional development of party organizational attributes {.tabset .tabset-pills}

Below we show the development over time split by six socio-political regions. Some notable trends stand out:

-   In Eastern Europe and Central Asia there is a break after the collapse of communist single-party regimes, especially regarding organizational extensiveness and reach. In the recent past, parties show a slight trend towards more exclusive nomination procedures. At the same time, the personalization of party politics increases while disagreement among party elites declines. In general, parties have more local offices than the global average.
-   In Latin America and the Caribbean, ties to social organizations erode in recent years, but candidate nomination procedures become more inclusive. While personalization remains quite stable, elites are slightly less cohesive.
-   In the Middle East and Northern Africa, there is a slight trend towards fewer local offices. Candidate nomination is a rather exclusive process, though, well below the global average. Much like in Latin America, party personalization remains quite stable and elites show more disagreement.
-   In Sub-Saharan Africa, parties tend to opt for more inclusive nomination procedures. At the same time, social ties erode and there is more conflict among party elites over party strategies.
-   In Western Europe and North America, there is a slight trend towards more personalized parties. Compared with other regions, however, personalization is still far lower. Ties to social organizations also start eroding despite a high level of active community presence; party elites in turn show more disagreement.
-   In Asia and Pacific, parties do not show a clear trend, although they are becoming slightly more inclusive regarding candidate nomination and show a slight trend towards more disagreement among party elites.

### Eastern Europe and Central Asia

```{r, message=FALSE, warning=FALSE, fig.align='center'}
    # Descriptive stats
df %>%
  filter(e_regionpol_6C == 1) %>% 
  select(v2palocoff, v2paactcom, v2pasoctie, v2panom, v2padisa, v2paind) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)  

  # scatterplots
scat1 <- df %>% filter(e_regionpol_6C == 1) %>% ggplot(aes(year, v2palocoff)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat2 <- df %>% filter(e_regionpol_6C == 1) %>% ggplot(aes(year, v2paactcom)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat3 <- df %>% filter(e_regionpol_6C == 1) %>% ggplot(aes(year, v2pasoctie)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat4 <- df %>% filter(e_regionpol_6C == 1) %>% ggplot(aes(year, v2panom))    + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat5 <- df %>% filter(e_regionpol_6C == 1) %>% ggplot(aes(year, v2padisa))   + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat6 <- df %>% filter(e_regionpol_6C == 1) %>% ggplot(aes(year, v2paind))    + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()

grid.arrange(scat1, scat2, scat3, scat4, scat5, scat6, ncol = 3, nrow = 2)
```

### Latin America and the Caribbean

```{r, message=FALSE, warning=FALSE, fig.align='center'}
    # Descriptive stats
df %>%
  filter(e_regionpol_6C == 2) %>% 
  select(v2palocoff, v2paactcom, v2pasoctie, v2panom, v2padisa, v2paind) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)  

  # scatterplots
scat1 <- df %>% filter(e_regionpol_6C == 2) %>% ggplot(aes(year, v2palocoff)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat2 <- df %>% filter(e_regionpol_6C == 2) %>% ggplot(aes(year, v2paactcom)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat3 <- df %>% filter(e_regionpol_6C == 2) %>% ggplot(aes(year, v2pasoctie)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat4 <- df %>% filter(e_regionpol_6C == 2) %>% ggplot(aes(year, v2panom))    + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat5 <- df %>% filter(e_regionpol_6C == 2) %>% ggplot(aes(year, v2padisa))   + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat6 <- df %>% filter(e_regionpol_6C == 2) %>% ggplot(aes(year, v2paind))    + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()

grid.arrange(scat1, scat2, scat3, scat4, scat5, scat6, ncol = 3, nrow = 2)
```

### The Middle East and Northern Africa

```{r, message=FALSE, warning=FALSE, fig.align='center'}
    # Descriptive stats
df %>%
  filter(e_regionpol_6C == 3) %>% 
  select(v2palocoff, v2paactcom, v2pasoctie, v2panom, v2padisa, v2paind) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)  

  # scatterplots
scat1 <- df %>% filter(e_regionpol_6C == 3) %>% ggplot(aes(year, v2palocoff)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat2 <- df %>% filter(e_regionpol_6C == 3) %>% ggplot(aes(year, v2paactcom)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat3 <- df %>% filter(e_regionpol_6C == 3) %>% ggplot(aes(year, v2pasoctie)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat4 <- df %>% filter(e_regionpol_6C == 3) %>% ggplot(aes(year, v2panom))    + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat5 <- df %>% filter(e_regionpol_6C == 3) %>% ggplot(aes(year, v2padisa))   + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat6 <- df %>% filter(e_regionpol_6C == 3) %>% ggplot(aes(year, v2paind))    + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()

grid.arrange(scat1, scat2, scat3, scat4, scat5, scat6, ncol = 3, nrow = 2)
```

### Sub-Saharan Africa

```{r, message=FALSE, warning=FALSE, fig.align='center'}
    # Descriptive stats
df %>%
  filter(e_regionpol_6C == 4) %>% 
  select(v2palocoff, v2paactcom, v2pasoctie, v2panom, v2padisa, v2paind) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)  

  # scatterplots
scat1 <- df %>% filter(e_regionpol_6C == 4) %>% ggplot(aes(year, v2palocoff)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat2 <- df %>% filter(e_regionpol_6C == 4) %>% ggplot(aes(year, v2paactcom)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat3 <- df %>% filter(e_regionpol_6C == 4) %>% ggplot(aes(year, v2pasoctie)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat4 <- df %>% filter(e_regionpol_6C == 4) %>% ggplot(aes(year, v2panom))    + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat5 <- df %>% filter(e_regionpol_6C == 4) %>% ggplot(aes(year, v2padisa))   + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat6 <- df %>% filter(e_regionpol_6C == 4) %>% ggplot(aes(year, v2paind))    + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()

grid.arrange(scat1, scat2, scat3, scat4, scat5, scat6, ncol = 3, nrow = 2)
```

### Western Europe and North America

```{r, message=FALSE, warning=FALSE, fig.align='center'}
    # Descriptive stats
df %>%
  filter(e_regionpol_6C == 5) %>% 
  select(v2palocoff, v2paactcom, v2pasoctie, v2panom, v2padisa, v2paind) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)  

  # scatterplots
scat1 <- df %>% filter(e_regionpol_6C == 5) %>% ggplot(aes(year, v2palocoff)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat2 <- df %>% filter(e_regionpol_6C == 5) %>% ggplot(aes(year, v2paactcom)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat3 <- df %>% filter(e_regionpol_6C == 5) %>% ggplot(aes(year, v2pasoctie)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat4 <- df %>% filter(e_regionpol_6C == 5) %>% ggplot(aes(year, v2panom))    + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat5 <- df %>% filter(e_regionpol_6C == 5) %>% ggplot(aes(year, v2padisa))   + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat6 <- df %>% filter(e_regionpol_6C == 5) %>% ggplot(aes(year, v2paind))    + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()

grid.arrange(scat1, scat2, scat3, scat4, scat5, scat6, ncol = 3, nrow = 2)
```

### Asia and Pacific

```{r, message=FALSE, warning=FALSE, fig.align='center'}
    # Descriptive stats
df %>%
  filter(e_regionpol_6C == 6) %>% 
  select(v2palocoff, v2paactcom, v2pasoctie, v2panom, v2padisa, v2paind) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)  

  # scatterplots
scat1 <- df %>% filter(e_regionpol_6C == 6) %>% ggplot(aes(year, v2palocoff)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat2 <- df %>% filter(e_regionpol_6C == 6) %>% ggplot(aes(year, v2paactcom)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat3 <- df %>% filter(e_regionpol_6C == 6) %>% ggplot(aes(year, v2pasoctie)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat4 <- df %>% filter(e_regionpol_6C == 6) %>% ggplot(aes(year, v2panom))    + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat5 <- df %>% filter(e_regionpol_6C == 6) %>% ggplot(aes(year, v2padisa))   + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()
scat6 <- df %>% filter(e_regionpol_6C == 6) %>% ggplot(aes(year, v2paind))    + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal()

grid.arrange(scat1, scat2, scat3, scat4, scat5, scat6, ncol = 3, nrow = 2)
```

##  {.unlisted .unnumbered}

## Coder (Dis-) Agreement

There is no easy way to "test" the validity of the time-series. In [Part Four](#part-four---cross-validation) we investigate V-Party's criterion validity by correlating the items with data from other expert surveys. Because these surveys were conducted at very different points in time, the comparison gives a good first impression of validity across the period covered. Still, one may also look at coder (dis-) agreement, which mirrors the uncertainty of the judgments. Larger disagreement for earlier years in particular could indicate that experts had more difficulty assessing parties' organizational capacities for those years than for recent elections. This, in turn, would mean that inferences based on the early elections should be treated with more caution. In any case, and unlike other datasets, V-Party includes uncertainty estimates, e.g. variables with the suffixes `_codehigh` and `_codelow`, which represent the upper and lower bounds of the measurement model's highest posterior density (HPD) interval; this makes it easy to account for uncertainty in inferential models.

Below we look at the *range* between the upper and the lower bound of the HPD intervals, where a larger range indicates more disagreement among the ratings, and plot the mean range over time and for subgroups. Overall, we find a modest trend: coder disagreement is indeed slightly higher for ratings of earlier elections. Notably, expert ratings for elite cohesion show a larger range on average, suggesting that coders had more diverging views on what constitutes "enough" internal dissent to choose the next category. The general trend is mainly driven by less disagreement over parties in liberal, Western democracies for more recent time points. In contrast, experts showed more agreement for parties in electoral autocracies in earlier years. Likewise, ratings diverge more for recent years regarding parties' social ties (apart from parties in Latin America and the Caribbean).

Inspired by McMann et al.'s (2021, 13-14) analysis, we further analyze the HPD range in a regression framework. Far from presenting a sophisticated theory of coder (dis-) agreement, we include year, the number of experts who rated an observation, regime type, and country dummies as covariates to assess whether there is any systematic bias. We do find a statistically significant negative effect of year, meaning that coder disagreement is lower for more recent years, but only for the items on local offices and active communities. Unsurprisingly, regime type (with electoral autocracies as the reference category, since closed autocracies are excluded from the sample) has a negative, at times even statistically significant, effect on the HPD range, suggesting that experts on electoral and liberal democracies can likely base their judgment on a broader range of information, leading to more agreement. Regarding the number of coders, we find that the more experts rated a party, the less disagreement there is. In conjunction with particular country effects, this suggests that it is harder to code parties in certain countries (and to recruit enough experts) than to code the distant past.

In sum, there is mixed evidence concerning coder (dis-) agreement over time. On the one hand, there is a modest overall trend for ratings to converge more for recent elections, which *may* indicate that retrospective judgment is slightly more difficult. On the other hand, respondent disagreement is not critically high, and there is no systematic bias that would call the data for the 1970s, 1980s, or 1990s into doubt. If anything, our tentative results suggest that experts agree more on parties where the informational environment is more favorable.
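
To make the role of the uncertainty bounds more concrete, the following minimal sketch uses made-up values for two hypothetical parties on a single item. It computes the range between the upper and lower bound and checks whether the two intervals overlap. Reading overlapping intervals as "not clearly distinguishable" is a rough heuristic assumed here for illustration, not a procedure taken from the V-Party documentation; `party_a`, `party_b`, `hpd_range()` and `intervals_overlap()` are hypothetical names.

```r
# Hypothetical point estimates and interval bounds for two parties on one item
party_a <- c(est = 1.8, codelow = 1.2, codehigh = 2.4)
party_b <- c(est = 2.6, codelow = 2.0, codehigh = 3.1)

# Range of the interval: a larger value means more uncertainty in the rating
hpd_range <- function(p) unname(p["codehigh"] - p["codelow"])

# Two intervals overlap if each one starts before the other one ends
intervals_overlap <- function(p, q) {
  unname(p["codelow"] <= q["codehigh"] && q["codelow"] <= p["codehigh"])
}

range_a <- hpd_range(party_a)
range_b <- hpd_range(party_b)
overlap <- intervals_overlap(party_a, party_b)
```

In this toy case the point estimates differ, but the intervals overlap, so a cautious reading would not treat the two parties as clearly different on this item.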

```{r, message=FALSE, warning=FALSE, fig.align='center'}

  # Generate difference or range of upper and lower HPD interval; a large range = more disagreement = more uncertainty
df <- 
  df %>% 
  mutate(
    v2palocoff_hdprange = if_else(!is.na(v2palocoff), v2palocoff_codehigh - v2palocoff_codelow, NA_real_),
    v2paactcom_hdprange = if_else(!is.na(v2paactcom), v2paactcom_codehigh - v2paactcom_codelow, NA_real_),
    v2pasoctie_hdprange = if_else(!is.na(v2pasoctie), v2pasoctie_codehigh - v2pasoctie_codelow, NA_real_),
    v2panom_hdprange    = if_else(!is.na(v2panom)   , v2panom_codehigh    - v2panom_codelow   , NA_real_),
    v2padisa_hdprange   = if_else(!is.na(v2padisa)  , v2padisa_codehigh   - v2padisa_codelow  , NA_real_),
    v2paind_hdprange    = if_else(!is.na(v2paind)   , v2paind_codehigh    - v2paind_codelow   , NA_real_)
  )

    # Descriptive stats
df %>%
  select(v2palocoff_hdprange, v2paactcom_hdprange, v2pasoctie_hdprange, v2panom_hdprange, v2padisa_hdprange, v2paind_hdprange) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)


  # Scatterplots
scat1 <- df %>% ggplot(aes(year, v2palocoff_hdprange)) + geom_smooth(size=2) + coord_cartesian(ylim = c(1.0, 1.4)) + theme_minimal()
scat2 <- df %>% ggplot(aes(year, v2paactcom_hdprange)) + geom_smooth(size=2) + coord_cartesian(ylim = c(1.0, 1.4)) + theme_minimal()
scat3 <- df %>% ggplot(aes(year, v2pasoctie_hdprange)) + geom_smooth(size=2) + coord_cartesian(ylim = c(1.0, 1.4)) + theme_minimal()
scat4 <- df %>% ggplot(aes(year, v2panom_hdprange))    + geom_smooth(size=2) + coord_cartesian(ylim = c(1.0, 1.4)) + theme_minimal()
scat5 <- df %>% ggplot(aes(year, v2padisa_hdprange))   + geom_smooth(size=2) + coord_cartesian(ylim = c(1.0, 1.4)) + theme_minimal()
scat6 <- df %>% ggplot(aes(year, v2paind_hdprange))    + geom_smooth(size=2) + coord_cartesian(ylim = c(1.0, 1.4)) + theme_minimal()

grid.arrange(scat1, scat2, scat3, scat4, scat5, scat6, ncol = 3, nrow = 2)
```


### Further analyses {.tabset .tabset-pills}

#### (Dis-) Agreement by regime type

```{r, message=FALSE, warning=FALSE}
  
  # Scatterplots
scat1 <-
  df %>%
  ggplot(aes(year, v2palocoff_hdprange)) + 
  geom_smooth(aes(colour = factor(v2x_regime_lab)), size=2) + 
  coord_cartesian(ylim = c(0.9, 1.4)) +
  theme_minimal() + theme(legend.position = "bottom") + 
  scale_color_brewer(palette="Set1") +
  labs(colour = "Regimes of the World (RoW)")


scat2 <-
  df %>%
  ggplot(aes(year, v2paactcom_hdprange)) + 
  coord_cartesian(ylim = c(0.9, 1.4)) +
  geom_smooth(aes(colour = factor(v2x_regime_lab)), size=2) +
  scale_color_brewer(palette="Set1") +
  theme_minimal() + theme(legend.position = "none")

scat3 <-
  df %>%
  ggplot(aes(year, v2pasoctie_hdprange)) + 
  coord_cartesian(ylim = c(0.9, 1.4)) +
  geom_smooth(aes(colour = factor(v2x_regime_lab)), size=2) +
  scale_color_brewer(palette="Set1") +
  theme_minimal() + theme(legend.position = "none")

scat4 <-
  df %>%
  ggplot(aes(year, v2panom_hdprange)) + 
  coord_cartesian(ylim = c(0.9, 1.4)) +
  geom_smooth(aes(colour = factor(v2x_regime_lab)), size=2) +
  scale_color_brewer(palette="Set1") +
  theme_minimal() + theme(legend.position = "none")

scat5 <-
  df %>%
  ggplot(aes(year, v2padisa_hdprange)) + 
  coord_cartesian(ylim = c(0.9, 1.4)) +
  geom_smooth(aes(colour = factor(v2x_regime_lab)), size=2) +
  scale_color_brewer(palette="Set1") +
  theme_minimal() + theme(legend.position = "none")

scat6 <-
  df %>%
  ggplot(aes(year, v2paind_hdprange)) + 
  coord_cartesian(ylim = c(0.9, 1.4)) +
  geom_smooth(aes(colour = factor(v2x_regime_lab)), size=2) +
  scale_color_brewer(palette="Set1") +
  theme_minimal() + theme(legend.position = "none")

ggarrange(scat1, scat2, scat3, scat4, scat5, scat6, ncol = 3, nrow = 2, common.legend = TRUE, legend="bottom")
```

#### (Dis-) Agreement by region

```{r, message=FALSE, warning=FALSE}
  
  # Scatterplots
scat1 <-
  df %>%
  ggplot(aes(year, v2palocoff_hdprange)) + 
  geom_smooth(aes(colour = factor(e_regionpol_6C_lab)), size=2) + 
  coord_cartesian(ylim = c(0.9, 1.4)) +
  theme_minimal() + theme(legend.position = "bottom") + 
  scale_color_brewer(palette="Set1") +
  labs(colour = "Socio-Political Region")


scat2 <-
  df %>%
  ggplot(aes(year, v2paactcom_hdprange)) + 
  coord_cartesian(ylim = c(0.9, 1.4)) +
  geom_smooth(aes(colour = factor(e_regionpol_6C_lab)), size=2) +
  scale_color_brewer(palette="Set1") +
  theme_minimal() + theme(legend.position = "none")

scat3 <-
  df %>%
  ggplot(aes(year, v2pasoctie_hdprange)) + 
  coord_cartesian(ylim = c(0.9, 1.4)) +
  geom_smooth(aes(colour = factor(e_regionpol_6C_lab)), size=2) +
  scale_color_brewer(palette="Set1") +
  theme_minimal() + theme(legend.position = "none")

scat4 <-
  df %>%
  ggplot(aes(year, v2panom_hdprange)) + 
  coord_cartesian(ylim = c(0.9, 1.4)) +
  geom_smooth(aes(colour = factor(e_regionpol_6C_lab)), size=2) +
  scale_color_brewer(palette="Set1") +
  theme_minimal() + theme(legend.position = "none")

scat5 <-
  df %>%
  ggplot(aes(year, v2padisa_hdprange)) + 
  coord_cartesian(ylim = c(0.9, 1.4)) +
  geom_smooth(aes(colour = factor(e_regionpol_6C_lab)), size=2) +
  scale_color_brewer(palette="Set1") +
  theme_minimal() + theme(legend.position = "none")

scat6 <-
  df %>%
  ggplot(aes(year, v2paind_hdprange)) + 
  coord_cartesian(ylim = c(0.9, 1.4)) +
  geom_smooth(aes(colour = factor(e_regionpol_6C_lab)), size=2) +
  scale_color_brewer(palette="Set1") +
  theme_minimal() + theme(legend.position = "none")

ggarrange(scat1, scat2, scat3, scat4, scat5, scat6, ncol = 3, nrow = 2, common.legend = TRUE, legend="bottom")
```

#### Regression results

```{r}
  # Run regression
m1 <- lm(v2palocoff_hdprange ~  year + as.factor(v2x_regime) + as.factor(country_text_id) + v2palocoff_nr, data = df)
m2 <- lm(v2paactcom_hdprange ~  year + as.factor(v2x_regime) + as.factor(country_text_id) + v2paactcom_nr, data = df)
m3 <- lm(v2pasoctie_hdprange ~  year + as.factor(v2x_regime) + as.factor(country_text_id) + v2pasoctie_nr, data = df)
m4 <- lm(v2panom_hdprange    ~  year + as.factor(v2x_regime) + as.factor(country_text_id) + v2panom_nr   , data = df)
m5 <- lm(v2padisa_hdprange   ~  year + as.factor(v2x_regime) + as.factor(country_text_id) + v2padisa_nr  , data = df)
m6 <- lm(v2paind_hdprange    ~  year + as.factor(v2x_regime) + as.factor(country_text_id) + v2paind_nr   , data = df)

  # Obtain clustered standard errors
m1_clse <- coeftest(m1, vcovCL(m1, type='HC1', cluster=~v2paid))[,2]
m2_clse <- coeftest(m2, vcovCL(m2, type='HC1', cluster=~v2paid))[,2]
m3_clse <- coeftest(m3, vcovCL(m3, type='HC1', cluster=~v2paid))[,2]
m4_clse <- coeftest(m4, vcovCL(m4, type='HC1', cluster=~v2paid))[,2]
m5_clse <- coeftest(m5, vcovCL(m5, type='HC1', cluster=~v2paid))[,2]
m6_clse <- coeftest(m6, vcovCL(m6, type='HC1', cluster=~v2paid))[,2]

  # Obtain p-values
m1_pv <- coeftest(m1, vcovCL(m1, type='HC1', cluster=~v2paid))[,4]
m2_pv <- coeftest(m2, vcovCL(m2, type='HC1', cluster=~v2paid))[,4]
m3_pv <- coeftest(m3, vcovCL(m3, type='HC1', cluster=~v2paid))[,4]
m4_pv <- coeftest(m4, vcovCL(m4, type='HC1', cluster=~v2paid))[,4]
m5_pv <- coeftest(m5, vcovCL(m5, type='HC1', cluster=~v2paid))[,4]
m6_pv <- coeftest(m6, vcovCL(m6, type='HC1', cluster=~v2paid))[,4]
```

```{r, results = 'asis', echo = FALSE}
  # Output
htmlreg(l = list(m1, m2, m3, m4, m5, m6), 
          override.se = list(m1_clse, m2_clse, m3_clse, m4_clse, m5_clse, m6_clse), 
          override.pvalues = list(m1_pv, m2_pv, m3_pv, m4_pv, m5_pv, m6_pv),
          custom.model.names = c("v2palocoff", "v2paactcom", "v2pasoctie", "v2panom", "v2padisa", "v2paind"),
          custom.header = list("DV: HDP range of..." = 1:6),
          custom.note = ("%stars. Linear regression coefficients with standard errors clustered on parties in parentheses."),
          reorder.coef = c(2:4, 123, 125, 126, 127, 129, 130, 5:122, 124, 128, 1),
          doctype = FALSE, center = FALSE, caption = "", digits = 3, stars = c(0.001, 0.01, 0.05))
```

##  {.unlisted .unnumbered}

</br>

------------------------------------------------------------------------

# Part Two - Factor Analysis

------------------------------------------------------------------------

</br>

The following graph plots the pairwise correlations of all six items. *For this section, we restrict ourselves to those observations that have complete data for all six items.* Notably, local party offices (*v2palocoff*), active local communities (*v2paactcom*), and ties to social organizations (*v2pasoctie*) correlate quite strongly. Personalization (*v2paind*) and candidate selection (*v2panom*) are strongly but negatively correlated, i.e. the more a party is focused on its leader, the less inclusive its candidate nomination. In general, elite cohesion (*v2padisa*) is only weakly correlated with the other items. The negative correlation of elite cohesion and candidate nomination indicates that parties with more inclusive nomination rules show a slight tendency to be less cohesive. While speculative for now, including lower cadres in nomination processes might lead to internal competition and thus be exactly the reason why a party is viewed as less cohesive.

```{r, message=FALSE, warning=FALSE, fig.align='center'}
df %>% select(v2palocoff, v2paactcom, v2pasoctie, v2panom, v2padisa, v2paind) %>% filter(complete.cases(.)) %>% ggpairs(., diag = list("continuous"="blank"))
```
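The restriction to complete cases can be illustrated with a minimal sketch. The data frame `toy` and its item names are hypothetical stand-ins, not V-Party variables; the point is only that listwise deletion makes every pairwise correlation rest on the same observations, whereas pairwise deletion does not.

```r
# Simulated stand-in data; names and values are hypothetical.
set.seed(42)
n <- 200
toy <- data.frame(
  item_a = rnorm(n),
  item_b = rnorm(n),
  item_c = rnorm(n)
)

# Introduce missing values so that listwise deletion has an effect.
toy$item_a[sample(n, 20)] <- NA
toy$item_c[sample(n, 30)] <- NA

# Listwise deletion: keep only rows observed on every item.
toy_complete <- toy[complete.cases(toy), ]

# All correlations are now based on the same set of rows.
cor_listwise <- cor(toy_complete)

# For comparison: pairwise deletion uses a different n for each pair of items.
cor_pairwise <- cor(toy, use = "pairwise.complete.obs")

nrow(toy)           # rows before listwise deletion
nrow(toy_complete)  # rows after listwise deletion
```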

</br>

Given such intercorrelations among the items, factor analysis is an appropriate tool for uncovering any dimensionality of the data (Comrey and Lee 1992, 4). When conducting an exploratory factor analysis (EFA), a crucial step is determining the number of factors to retain, and several rules-of-thumb and criteria have been proposed (Kahn 2006). As a starting point, we apply the VSS criterion (Very Simple Structure) suggested by Revelle and Rocklin (1979) that compares the fit of a simplified model to the initial correlations. According to the VSS, a two-factor, possibly a three-factor solution is preferable. This comes as no surprise given the pairwise correlations.

```{r, message=FALSE, warning=FALSE}

    # keep only selected vars
df_fa      <- df %>% select(v2palocoff, v2paactcom, v2pasoctie, v2panom, v2padisa, v2paind)    %>% filter(complete.cases(.))
df_fa_main <- df %>% select(v2palocoff, v2paactcom, v2pasoctie, v2panom, v2padisa, v2paindrev) %>% filter(complete.cases(.)) # final solution includes v2paind reversed 

    # correlation matrix
df_fa_cor      <- cor(df_fa, use = "complete.obs")
df_fa_cor_main <- cor(df_fa_main, use = "complete.obs") # final solution includes v2paind reversed 

    # scree plot (df_fa_cor is already a correlation matrix, so it is passed directly)
VSS.scree(df_fa_cor, main = "VSS Scree plot")
```
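As a rough illustration of what a scree plot summarizes, the sketch below computes the eigenvalues of a correlation matrix in base R. The latent factors `f1` and `f2`, the items `x1` to `x6`, and all coefficients are simulated assumptions and do not reproduce the V-Party data.

```r
# Simulated six-item data driven by two hypothetical latent factors.
set.seed(1)
n  <- 1000
f1 <- rnorm(n)
f2 <- rnorm(n)

toy <- data.frame(
  x1 =  0.8 * f1 + rnorm(n, sd = 0.6),
  x2 =  0.8 * f1 + rnorm(n, sd = 0.6),
  x3 =  0.7 * f1 + rnorm(n, sd = 0.7),
  x4 =  0.7 * f2 + rnorm(n, sd = 0.7),
  x5 =  0.3 * f2 + rnorm(n, sd = 0.95),
  x6 = -0.7 * f2 + rnorm(n, sd = 0.7)
)

# Correlation matrix of the items (computed once, from the raw data).
toy_cor <- cor(toy)

# Eigenvalues of the correlation matrix, returned in decreasing order.
ev <- eigen(toy_cor, symmetric = TRUE, only.values = TRUE)$values

# Plotting these values against their rank gives the scree plot;
# they sum to the number of items because the matrix has a unit diagonal.
round(ev, 3)
sum(ev)
```

A sharp drop followed by a flat tail in these values is the usual visual cue for the number of factors worth retaining.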

Following Kahn's (2006) advice, we use the principal-axis factoring (PAF) method for extracting the factors and Promax rotation for the main model. We still test alternative specifications below. Regardless of the extraction and rotation method, three things are notable.

-   First, local party offices (*v2palocoff*), active local communities (*v2paactcom*), and ties to social organizations (*v2pasoctie*) always load high on the first factor. Given their already strong pairwise correlations, one can say with good conscience that the first factor captures **organizational extensiveness**.
-   Second, personalization (*v2paind*) and candidate selection (*v2panom*) always form the second factor. As *v2panom* and *v2paind* are always negatively correlated, switching the scale of *v2paind* seems advisable. This way, both items capture the **intra-party power concentration** between the lower cadres and the leadership, with low levels describing parties with a hierarchical, centralized decision-making structure.
-   Finally, **elite cohesion** (*v2padisa*) behaves differently from the other items. When opting for a three-factor solution, *v2padisa* usually embodies the third factor. When forced into a two-factor solution, it loads on the second factor together with personalization and candidate nomination. Its factor loadings are markedly lower, though. The analysis thus supports Tavits' (2011, 4) notion that cohesion is a distinct aspect of party organizations.
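
The effect of switching an item's scale can be shown with a small sketch. Both variables are simulated, and reversing by sign flip is an assumption made here for illustration; the recoding actually used for *v2paindrev* is not shown in this section.

```r
# Two simulated, negatively related items (hypothetical stand-ins).
set.seed(7)
n <- 300
leader_focus  <- rnorm(n)
inclusive_nom <- -0.6 * leader_focus + rnorm(n, sd = 0.8)

# Assumption: reverse an interval-scaled item by flipping its sign.
leader_focus_rev <- -leader_focus

# The correlation keeps its magnitude but changes sign.
cor_before <- cor(leader_focus, inclusive_nom)
cor_after  <- cor(leader_focus_rev, inclusive_nom)

c(before = cor_before, after = cor_after)
```

After the reversal both items point in the same direction, so their loadings on a shared factor carry the same sign.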

## Main model - Three factor solution

Note that based on our exploratory analysis and the alternative specifications below, we decided to switch the scale of *v2paind* to be in line with *v2panom*, i.e. lower values now indicate that a party is geared towards the leader and s/he can unilaterally decide on candidates. For the main model presented here, we already include the reversed item (*v2paindrev*). For the alternative specifications below we still report the original scale to make our decision transparent.

-   *Number of factors*: 3
-   *Extraction method*: Principal axes (PAF)
-   *Rotation*: Promax

```{r, message=FALSE, warning=FALSE, fig.align='center'}

    # efa
factors_data_main <- fa(r = df_fa_main, nfactors = 3, fm = "pa", rotate = "promax")

    # output
out1 <- as.data.frame(factors_data_main$loadings[1:6,])
out2 <- as.data.frame(factors_data_main$communality[1:6]) %>% rename("Communality" = "factors_data_main$communality[1:6]")
out3 <- as.data.frame(factors_data_main$uniquenesses[1:6]) %>% rename("Uniquenesses" = "factors_data_main$uniquenesses[1:6]")
out4 <- bind_cols(out2, out3, out1)
out5 <- as.data.frame(factors_data_main$Vaccounted[1:5,1:3])

out4 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
out5 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)

fa.plot(factors_data_main, labels = c("v2palocoff", "v2paactcom", "v2pasoctie", "v2panom", "v2padisa", "v2paindrev"), main = "Factor loadings")
```
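
To help read the communality and uniqueness columns, the following sketch applies the standard factor-analytic identity by hand. The loading matrix `L` and the factor correlation matrix `Phi` are invented numbers, not estimates from the model above.

```r
# Hypothetical pattern loadings of six items on three factors.
L <- matrix(
  c(0.80, 0.05, 0.00,
    0.75, 0.00, 0.05,
    0.70, 0.10, 0.00,
    0.05, 0.70, 0.00,
    0.00, 0.10, 0.50,
    0.00, 0.65, 0.05),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("item", 1:6), paste0("F", 1:3))
)

# Hypothetical factor correlations; an oblique rotation allows non-zero off-diagonals.
Phi <- matrix(
  c(1.0, 0.3, 0.1,
    0.3, 1.0, 0.2,
    0.1, 0.2, 1.0),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("F", 1:3), paste0("F", 1:3))
)

# Communality: share of each item's variance reproduced by the common factors.
communality <- diag(L %*% Phi %*% t(L))

# Uniqueness: the remaining share for standardized items.
uniqueness <- 1 - communality

round(cbind(communality, uniqueness), 3)
```

With uncorrelated factors, `Phi` is the identity matrix and the communality reduces to the row sums of squared loadings.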

## Alternative specifications {.tabset .tabset-pills}

### Alternative I

-   *Number of factors*: 2
-   *Extraction method*: Principal axes (PAF)
-   *Rotation*: Promax

```{r, message=FALSE, warning=FALSE}

    # efa
factors_data <- fa(r = df_fa_cor, nfactors = 2, fm = "pa", rotate = "promax")

    # output
out1 <- as.data.frame(factors_data$loadings[1:6,])
out2 <- as.data.frame(factors_data$communality[1:6]) %>% rename("Communality" = "factors_data$communality[1:6]")
out3 <- as.data.frame(factors_data$uniquenesses[1:6]) %>% rename("Uniquenesses" = "factors_data$uniquenesses[1:6]")
out4 <- bind_cols(out2, out3, out1)
out5 <- as.data.frame(factors_data$Vaccounted[1:5,1:2])

out4 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
out5 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
```

### Alternative II

-   *Number of factors*: 3
-   *Extraction method*: Principal axes (PAF)
-   *Rotation*: Oblimin

```{r, message=FALSE, warning=FALSE}

    # efa
factors_data <- fa(r = df_fa_cor, nfactors = 3, fm = "pa", rotate = "oblimin")

    # output
out1 <- as.data.frame(factors_data$loadings[1:6,])
out2 <- as.data.frame(factors_data$communality[1:6]) %>% rename("Communality" = "factors_data$communality[1:6]")
out3 <- as.data.frame(factors_data$uniquenesses[1:6]) %>% rename("Uniquenesses" = "factors_data$uniquenesses[1:6]")
out4 <- bind_cols(out2, out3, out1)
out5 <- as.data.frame(factors_data$Vaccounted[1:5,1:3])

out4 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
out5 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
```

### Alternative III

-   *Number of factors*: 2
-   *Extraction method*: Principal axes (PAF)
-   *Rotation*: Oblimin

```{r, message=FALSE, warning=FALSE}

    # efa
factors_data <- fa(r = df_fa_cor, nfactors = 2, fm = "pa", rotate = "oblimin")

    # output
out1 <- as.data.frame(factors_data$loadings[1:6,])
out2 <- as.data.frame(factors_data$communality[1:6]) %>% rename("Communality" = "factors_data$communality[1:6]")
out3 <- as.data.frame(factors_data$uniquenesses[1:6]) %>% rename("Uniquenesses" = "factors_data$uniquenesses[1:6]")
out4 <- bind_cols(out2, out3, out1)
out5 <- as.data.frame(factors_data$Vaccounted[1:5,1:2])

out4 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
out5 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
```

### Alternative IV

-   *Number of factors*: 3
-   *Extraction method*: OLS Minimum residual
-   *Rotation*: Varimax

```{r, message=FALSE, warning=FALSE}

    # efa
factors_data <- fa(r = df_fa_cor, nfactors = 3, fm = "minres", rotate = "varimax")

    # output
out1 <- as.data.frame(factors_data$loadings[1:6,])
out2 <- as.data.frame(factors_data$communality[1:6]) %>% rename("Communality" = "factors_data$communality[1:6]")
out3 <- as.data.frame(factors_data$uniquenesses[1:6]) %>% rename("Uniquenesses" = "factors_data$uniquenesses[1:6]")
out4 <- bind_cols(out2, out3, out1)
out5 <- as.data.frame(factors_data$Vaccounted[1:5,1:3])

out4 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
out5 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
```

### Alternative V

-   *Number of factors*: 2
-   *Extraction method*: OLS Minimum residual
-   *Rotation*: Varimax

```{r, message=FALSE, warning=FALSE}

    # efa
factors_data <- fa(r = df_fa_cor, nfactors = 2, fm = "minres", rotate = "Varimax")

    # output
out1 <- as.data.frame(factors_data$loadings[1:6,])
out2 <- as.data.frame(factors_data$communality[1:6]) %>% rename("Communality" = "factors_data$communality[1:6]")
out3 <- as.data.frame(factors_data$uniquenesses[1:6]) %>% rename("Uniquenesses" = "factors_data$uniquenesses[1:6]")
out4 <- bind_cols(out2, out3, out1)
out5 <- as.data.frame(factors_data$Vaccounted[1:5,1:2])

out4 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
out5 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
```

### Alternative VI

-   *Number of factors*: 6
-   *Extraction method*: OLS Minimum residual
-   *Rotation*: Varimax

```{r, message=FALSE, warning=FALSE}

    # efa
factors_data <- fa(r = df_fa_cor, nfactors = 6, fm = "minres", rotate = "Varimax")

    # output
out1 <- as.data.frame(factors_data$loadings[1:6,])
out2 <- as.data.frame(factors_data$communality[1:6]) %>% rename("Communality" = "factors_data$communality[1:6]")
out3 <- as.data.frame(factors_data$uniquenesses[1:6]) %>% rename("Uniquenesses" = "factors_data$uniquenesses[1:6]")
out4 <- bind_cols(out2, out3, out1)
out5 <- as.data.frame(factors_data$Vaccounted[1:5,1:6])

out4 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
out5 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
```

##  {.unlisted .unnumbered}

## Factor analyses using subsets {.tabset .tabset-pills}

As EFA depends on the sample used, we re-run the main analysis on subsets of the data as a robustness check. The main three-factor solution is robust when looking at democracies only. Utilizing regional samples corroborates the main analysis, too. For the first factor capturing organizational extensiveness the picture remains quite stable with parties in the Middle East and Northern Africa being the only exception. Here, ties to social organizations (*v2pasoctie*) constitutes a third factor on its own, while elite cohesion loads on the second factor together with personalization and candidate nomination.

There are also notable differences regarding *v2padisa* again. Looking at electoral autocracies only, it loads high on the second factor together with candidate nomination procedures while personalization constitutes a third factor on its own. When focusing on parties in Eastern Europe and Central Asia, and especially on parties in Latin America and the Caribbean, elite cohesion falls together with personalization and candidate nomination. The reverse is true when focusing on parties in Sub-Saharan Africa, Western Europe and North America, and Asia and Pacific. Here, *v2padisa* emerges as a separate factor. This partly explains its behavior in the main analysis where the effects seem to cancel each other out. In any case, this finding calls for further inspection in future research about the causes of party cohesion in different contexts. For our purpose, however, the overall picture suggests treating elite cohesion separately from the other two dimensions.
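
The subset logic can be sketched compactly in base R: split the items by a grouping variable and compute one complete-case correlation matrix per group. The data frame `toy`, the region labels and the item names are hypothetical.

```r
# Simulated data with a hypothetical grouping variable.
set.seed(11)
n <- 600
toy <- data.frame(
  region = sample(c("Region A", "Region B", "Region C"), n, replace = TRUE),
  item1  = rnorm(n),
  item2  = rnorm(n),
  item3  = rnorm(n)
)
toy$item2 <- toy$item2 + 0.5 * toy$item1  # induce some correlation

items <- c("item1", "item2", "item3")

# One data frame of items per region.
by_region <- split(toy[, items], toy$region)

# Complete-case correlation matrix for each region.
cors <- lapply(by_region, function(d) cor(d[complete.cases(d), ]))

# Number of complete observations behind each matrix.
n_by_region <- vapply(by_region, function(d) sum(complete.cases(d)), integer(1))

n_by_region
lapply(cors, round, 2)
```

Each element of `cors` could then be passed to the same factor-analysis call, which keeps the specification identical across subsets.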

### Electoral and liberal democracies only

```{r, message=FALSE, warning=FALSE}

    # keep only selected vars and filter
df_fa2 <- df %>% filter(v2x_regime >= 2) %>% select(v2palocoff, v2paactcom, v2pasoctie, v2panom, v2padisa, v2paind) %>% filter(complete.cases(.))

    # new correlation matrix
df_fa2_cor <- cor(df_fa2, use = "complete.obs")

    # efa
factors_data <- fa(r = df_fa2_cor, nfactors = 3, fm = "pa", rotate = "promax")

    # output
out1 <- as.data.frame(factors_data$loadings[1:6,])
out2 <- as.data.frame(factors_data$communality[1:6]) %>% rename("Communality" = "factors_data$communality[1:6]")
out3 <- as.data.frame(factors_data$uniquenesses[1:6]) %>% rename("Uniquenesses" = "factors_data$uniquenesses[1:6]")
out4 <- bind_cols(out2, out3, out1)
out5 <- as.data.frame(factors_data$Vaccounted[1:5,1:3])

out4 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
out5 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
```

### Electoral autocracies only

```{r, message=FALSE, warning=FALSE}

    # keep only selected vars and filter
df_fa2 <- df %>% filter(v2x_regime <= 1) %>% select(v2palocoff, v2paactcom, v2pasoctie, v2panom, v2padisa, v2paind) %>% filter(complete.cases(.))

    # new correlation matrix
df_fa2_cor <- cor(df_fa2, use = "complete.obs")

    # efa
factors_data <- fa(r = df_fa2_cor, nfactors = 3, fm = "pa", rotate = "promax")

    # output
out1 <- as.data.frame(factors_data$loadings[1:6,])
out2 <- as.data.frame(factors_data$communality[1:6]) %>% rename("Communality" = "factors_data$communality[1:6]")
out3 <- as.data.frame(factors_data$uniquenesses[1:6]) %>% rename("Uniquenesses" = "factors_data$uniquenesses[1:6]")
out4 <- bind_cols(out2, out3, out1)
out5 <- as.data.frame(factors_data$Vaccounted[1:5,1:3])

out4 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
out5 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
```

### Eastern Europe and Central Asia

```{r, message=FALSE, warning=FALSE}

    # keep only selected vars and filter
df_fa2 <- df %>% filter(e_regionpol_6C == 1) %>% select(v2palocoff, v2paactcom, v2pasoctie, v2panom, v2padisa, v2paind) %>% filter(complete.cases(.))

    # new correlation matrix
df_fa2_cor <- cor(df_fa2, use = "complete.obs")

    # efa
factors_data <- fa(r = df_fa2_cor, nfactors = 3, fm = "pa", rotate = "promax")

    # output
out1 <- as.data.frame(factors_data$loadings[1:6,])
out2 <- as.data.frame(factors_data$communality[1:6]) %>% rename("Communality" = "factors_data$communality[1:6]")
out3 <- as.data.frame(factors_data$uniquenesses[1:6]) %>% rename("Uniquenesses" = "factors_data$uniquenesses[1:6]")
out4 <- bind_cols(out2, out3, out1)
out5 <- as.data.frame(factors_data$Vaccounted[1:5,1:3])

out4 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
out5 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
```

### Latin America and the Caribbean

*Note: factor extraction method switched to OLS minres as PAF triggered an unknown error.*

```{r, message=FALSE, warning=FALSE}

    # keep only selected vars and filter
df_fa2 <- df %>% filter(e_regionpol_6C == 2) %>% select(v2palocoff, v2paactcom, v2pasoctie, v2panom, v2padisa, v2paind) %>% filter(complete.cases(.))

    # new correlation matrix
df_fa2_cor <- cor(df_fa2, use = "complete.obs")

    # efa
factors_data <- fa(r = df_fa2_cor, nfactors = 3, fm = "minres", rotate = "promax")

    # output
out1 <- as.data.frame(factors_data$loadings[1:6,])
out2 <- as.data.frame(factors_data$communality[1:6]) %>% rename("Communality" = "factors_data$communality[1:6]")
out3 <- as.data.frame(factors_data$uniquenesses[1:6]) %>% rename("Uniquenesses" = "factors_data$uniquenesses[1:6]")
out4 <- bind_cols(out2, out3, out1)
out5 <- as.data.frame(factors_data$Vaccounted[1:5,1:3])

out4 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
out5 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
```

### The Middle East and Northern Africa

```{r, message=FALSE, warning=FALSE}

    # keep only selected vars and filter
df_fa2 <- df %>% filter(e_regionpol_6C == 3) %>% select(v2palocoff, v2paactcom, v2pasoctie, v2panom, v2padisa, v2paind) %>% filter(complete.cases(.))

    # new correlation matrix
df_fa2_cor <- cor(df_fa2, use = "complete.obs")

    # efa
factors_data <- fa(r = df_fa2_cor, nfactors = 3, fm = "pa", rotate = "promax")

    # output
out1 <- as.data.frame(factors_data$loadings[1:6,])
out2 <- as.data.frame(factors_data$communality[1:6]) %>% rename("Communality" = "factors_data$communality[1:6]")
out3 <- as.data.frame(factors_data$uniquenesses[1:6]) %>% rename("Uniquenesses" = "factors_data$uniquenesses[1:6]")
out4 <- bind_cols(out2, out3, out1)
out5 <- as.data.frame(factors_data$Vaccounted[1:5,1:3])

out4 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
out5 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
```

### Sub-Saharan Africa

*Note: factor extraction method switched to OLS minres as PAF triggered an unknown error.*

```{r, message=FALSE, warning=FALSE}

    # keep only selected vars and filter
df_fa2 <- df %>% filter(e_regionpol_6C == 4) %>% select(v2palocoff, v2paactcom, v2pasoctie, v2panom, v2padisa, v2paind) %>% filter(complete.cases(.))

    # new correlation matrix
df_fa2_cor <- cor(df_fa2, use = "complete.obs")

    # efa
factors_data <- fa(r = df_fa2_cor, nfactors = 3, fm = "minres", rotate = "promax")

    # output
out1 <- as.data.frame(factors_data$loadings[1:6,])
out2 <- as.data.frame(factors_data$communality[1:6]) %>% rename("Communality" = "factors_data$communality[1:6]")
out3 <- as.data.frame(factors_data$uniquenesses[1:6]) %>% rename("Uniquenesses" = "factors_data$uniquenesses[1:6]")
out4 <- bind_cols(out2, out3, out1)
out5 <- as.data.frame(factors_data$Vaccounted[1:5,1:3])

out4 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
out5 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
```

### Western Europe and North America

```{r, message=FALSE, warning=FALSE}

    # keep only selected vars and filter
df_fa2 <- df %>% filter(e_regionpol_6C == 5) %>% select(v2palocoff, v2paactcom, v2pasoctie, v2panom, v2padisa, v2paind) %>% filter(complete.cases(.))

    # new correlation matrix
df_fa2_cor <- cor(df_fa2, use = "complete.obs")

    # efa
factors_data <- fa(r = df_fa2_cor, nfactors = 3, fm = "pa", rotate = "promax")

    # output
out1 <- as.data.frame(factors_data$loadings[1:6,])
out2 <- as.data.frame(factors_data$communality[1:6]) %>% rename("Communality" = "factors_data$communality[1:6]")
out3 <- as.data.frame(factors_data$uniquenesses[1:6]) %>% rename("Uniquenesses" = "factors_data$uniquenesses[1:6]")
out4 <- bind_cols(out2, out3, out1)
out5 <- as.data.frame(factors_data$Vaccounted[1:5,1:3])

out4 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
out5 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
```

### Asia and Pacific

```{r, message=FALSE, warning=FALSE}

    # keep only selected vars and filter
df_fa2 <- df %>% filter(e_regionpol_6C == 6) %>% select(v2palocoff, v2paactcom, v2pasoctie, v2panom, v2padisa, v2paind) %>% filter(complete.cases(.))

    # new correlation matrix
df_fa2_cor <- cor(df_fa2, use = "complete.obs")

    # efa
factors_data <- fa(r = df_fa2_cor, nfactors = 3, fm = "pa", rotate = "promax")

    # output
out1 <- as.data.frame(factors_data$loadings[1:6,])
out2 <- as.data.frame(factors_data$communality[1:6]) %>% rename("Communality" = "factors_data$communality[1:6]")
out3 <- as.data.frame(factors_data$uniquenesses[1:6]) %>% rename("Uniquenesses" = "factors_data$uniquenesses[1:6]")
out4 <- bind_cols(out2, out3, out1)
out5 <- as.data.frame(factors_data$Vaccounted[1:5,1:3])

out4 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
out5 %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)
```

##  {.unlisted .unnumbered}

## Operationalizing Dimensions of Party Organizational Features

Five out of the six items show a clear pattern with extensive local party offices (*v2palocoff*), active local communities (*v2paactcom*), and ties to social organizations (*v2pasoctie*) constituting one dimension. Mirroring the "extensiveness and reach" (Tavits 2011, 4), this dimension captures the **organizational extensiveness** of a party. The lower its score on this first dimension, the less rooted a party would be. Given the mixed evidence for elite cohesion, we opt for a rather minimal definition following Tavits' (2011, 4) suggestion to aim for an operationalization that prevents overstretching the concept and that focuses on organizational structure instead. We therefore exclude *v2padisa* from operationalizing the second dimension. We reverse the scale of *v2paind* (personalization), so that higher values indicate a more collective stance (*v2paindrev*). Together with candidate nomination (*v2panom*) this dimension captures the **intra-party power concentration** between lower cadres and the leadership. A low score on this dimension describes a party with a highly hierarchical, centralized decision-making structure geared towards the will of the leader. Conversely, a high value mirrors a rather collectivist "grassroots democratic" party. The third dimension -- **elite cohesion** -- consists solely of *v2padisa*.

While numerous approaches exist for weighting and aggregating composite indicators -- each with its own strengths and weaknesses (see e.g. Gan et al. 2017; Greco et al. 2019; OECD 2008) -- as a starting point we opt for an additive index applying no weights and using the standardized (z-scores) items to allow for partial substitutability. This is sufficient for our purpose of validating V-Party's party organizational items with extant data. We believe, however, that the availability of V-Party will pave the way for refined analyses of dimensions of party organizational features. That said, we operationalize the dimensions as follows:

-   *organizational extensiveness*: orgext = v2palocoffstd + v2paactcomstd + v2pasoctiestd
-   *intra-party power concentration*: powercon = v2paindrevstd + v2panomstd
-   *elite cohesion*: cohesion = v2padisastd
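
A minimal sketch of such an unweighted additive index of z-scores is given below. The item names and simulated values are hypothetical, and standardizing over the pooled sample with `scale()` is an assumption; the construction of the `*std` variables is not shown in this section.

```r
# Simulated items on different scales (hypothetical stand-ins).
set.seed(3)
n <- 250
toy <- data.frame(
  locoff = rnorm(n, mean = 2.0, sd = 1.0),
  actcom = rnorm(n, mean = 1.5, sd = 0.8),
  soctie = rnorm(n, mean = 2.5, sd = 1.2),
  indrev = rnorm(n, mean = 0.0, sd = 1.5),
  nom    = rnorm(n, mean = 1.0, sd = 0.5)
)

# z-score helper: centre on the sample mean, divide by the sample sd.
z <- function(x) as.numeric(scale(x))

# Unweighted additive indices of the standardized items.
toy$orgext_toy   <- z(toy$locoff) + z(toy$actcom) + z(toy$soctie)
toy$powercon_toy <- z(toy$indrev) + z(toy$nom)

summary(toy[, c("orgext_toy", "powercon_toy")])
```

Because the standardized items are simply summed, a low score on one item can be offset by a high score on another, which is what partial substitutability refers to.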

</br>

------------------------------------------------------------------------

# Part Three - Trends & Insights

------------------------------------------------------------------------

In this part, we present descriptive statistics and insights for all three dimensions. Recall that high values for *orgext* indicate a party with strong boots on the ground. In a similar vein, high values on intra-party power concentration (*powercon*) mirror a rather collective, "grassroots democratic" party. Likewise, parties with little internal struggle show a high score on *cohesion*.

Given our global sample, we do not witness a clear trend in parties' organizational extensiveness. Regarding the intra-party power concentration, however, we see a slight shift of party organizations towards more centralized decision making, personalized politics and less cohesion, meaning increasing struggle, in recent years.

</br>

```{r, message=FALSE, warning=FALSE}

    # Descriptive stats
df %>%
  select(orgext, powercon, cohesion) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)  
```

```{r, message=FALSE, warning=FALSE, fig.align='center'}

    # Density plots by regime type (the palette is applied to the fill aesthetic, which is the one mapped here)
hist1 <- df %>% ggplot(aes(x=orgext))   + geom_density(aes(fill = factor(v2x_regime_lab)), alpha=0.3, color="black") + geom_rug(sides = "b", alpha = 0.05) + theme_minimal() + theme(legend.position = "bottom") + scale_fill_brewer(palette="Set1") + labs(fill = "Regimes of the World (RoW)")
hist2 <- df %>% ggplot(aes(x=powercon)) + geom_density(aes(fill = factor(v2x_regime_lab)), alpha=0.3, color="black") + geom_rug(sides = "b", alpha = 0.05) + theme_minimal() + theme(legend.position = "none") + scale_fill_brewer(palette="Set1")
hist3 <- df %>% ggplot(aes(x=cohesion)) + geom_density(aes(fill = factor(v2x_regime_lab)), alpha=0.3, color="black") + geom_rug(sides = "b", alpha = 0.05) + theme_minimal() + theme(legend.position = "none") + scale_fill_brewer(palette="Set1")

ggarrange(hist1, hist2, hist3, ncol = 3, nrow = 1, common.legend = TRUE, legend="bottom")
```

```{r, message=FALSE, warning=FALSE, fig.align='center'}

  # scatterplots
scat1 <-
  df %>%
  ggplot(aes(year, orgext)) + 
  geom_point(aes(colour = factor(v2x_regime_lab)), alpha=0.1) + 
  geom_smooth(aes(colour = factor(v2x_regime_lab)), size=2) + 
  theme_minimal() + theme(legend.position = "bottom") + 
  scale_color_brewer(palette="Set1") +
  labs(x = "", y = "Organizational extensiveness", colour = "Regimes of the World (RoW)")


scat2 <-
  df %>%
  ggplot(aes(year, powercon)) + 
  geom_point(aes(colour = factor(v2x_regime_lab)), alpha=0.1) + 
  geom_smooth(aes(colour = factor(v2x_regime_lab)), size=2) +
  theme_minimal() + theme(legend.position = "none") +
  scale_color_brewer(palette="Set1") +
  labs(x = "", y = "Intra-party power concentration")

scat3 <-
  df %>%
  ggplot(aes(year, cohesion)) + 
  geom_point(aes(colour = factor(v2x_regime_lab)), alpha=0.1) + 
  geom_smooth(aes(colour = factor(v2x_regime_lab)), size=2) +
  theme_minimal() + theme(legend.position = "none") +
  scale_color_brewer(palette="Set1") +
  labs(x = "", y = "Elite cohesion")

ggarrange(scat1, scat2, scat3, ncol = 3, nrow = 1, common.legend = TRUE, legend="bottom")
```

## Regional development of party organizational dimensions {.tabset .tabset-pills}

Looking at the development across regions reveals interesting patterns (the vertical lines within each density mark the quartiles). Generally speaking, parties in Western Europe and North America are more rooted in society and tend to give the "party on the ground" considerable influence in internal politics. Notably, parties in the Middle East and Northern Africa show a comparable level of organizational extensiveness; yet, internal power clearly rests with their leaders. Parties in both regions show surprisingly similar levels of elite cohesion. The reverse is generally true for parties in Latin America and the Caribbean. Here, parties combine less rootedness with more power for the lower cadres and members. Parties in Sub-Saharan Africa and in Asia and Pacific are quite evenly distributed on both dimensions. Such notable differences warrant further inspection and future analyses on the particular causes of differences *between* countries and regions, but also *within* regions and parties.

```{r, message=FALSE, warning=FALSE, fig.align='center'}

    # distribution of societal orgext by region
dist1 <- ggplot(df, aes(x = orgext, y = as.factor(e_regionpol_6C))) + 
        geom_density_ridges(fill="cornflowerblue", alpha=0.3, rel_min_height = 0.01, scale = 0.9, quantile_lines = TRUE, quantiles = 4) +
        theme_minimal() +
        scale_y_discrete("",
    		    labels=c("1" = "Eastern Europe and Central Asia",
    				 "2" = "Latin America and the Caribbean",
    				 "3" = "The Middle East and Northern Africa",
    				 "4" = "Sub-Saharan Africa", 
    				 "5" = "Western Europe and North America",
    				 "6" = "Asia and Pacific"))

    # distribution of balance of power by region
dist2 <- ggplot(df, aes(x = powercon, y = as.factor(e_regionpol_6C))) + 
        geom_density_ridges(fill="cornflowerblue", alpha=0.3, rel_min_height = 0.01, scale = 0.9, quantile_lines = TRUE, quantiles = 4) +
        theme_minimal() +
        scale_y_discrete("",
    		    labels=c("1" = "Eastern Europe and Central Asia",
    				 "2" = "Latin America and the Caribbean",
    				 "3" = "The Middle East and Northern Africa",
    				 "4" = "Sub-Saharan Africa", 
    				 "5" = "Western Europe and North America",
    				 "6" = "Asia and Pacific"))

    # distribution of elite cohesion by region
dist3 <- ggplot(df, aes(x = cohesion, y = as.factor(e_regionpol_6C))) + 
        geom_density_ridges(fill="cornflowerblue", alpha=0.3, rel_min_height = 0.01, scale = 0.9, quantile_lines = TRUE, quantiles = 4) +
        theme_minimal() +
        scale_y_discrete("",
    		    labels=c("1" = "Eastern Europe and Central Asia",
    				 "2" = "Latin America and the Caribbean",
    				 "3" = "The Middle East and Northern Africa",
    				 "4" = "Sub-Saharan Africa", 
    				 "5" = "Western Europe and North America",
    				 "6" = "Asia and Pacific"))
print(dist1)
print(dist2)
print(dist3)
```
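
The quantile lines in the ridge plots above split each regional distribution into four parts (`quantiles = 4`). The same cut points can be tabulated directly. The following is a minimal sketch on simulated data; the data frame `toy`, its region labels and its values are invented for illustration and are not V-Party data.

```r
# Simulated stand-in for `df`; values are random draws, not V-Party data.
set.seed(1)
toy <- data.frame(
  region = rep(c("A", "B"), each = 100),
  orgext = c(rnorm(100, mean = 0), rnorm(100, mean = 1))
)

# Quartile cut points (25th, 50th and 75th percentile) within each region.
# `probs` is passed through aggregate() to quantile().
quartiles <- aggregate(orgext ~ region, data = toy,
                       FUN = quantile, probs = c(0.25, 0.5, 0.75))
print(quartiles)
```

Applied to the real data, `region` would presumably be replaced by `e_regionpol_6C` and `toy` by `df`.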

</br>

### Eastern Europe and Central Asia

```{r, message=FALSE, warning=FALSE, fig.align='center'}
    # Descriptive stats
df %>%
  filter(e_regionpol_6C == 1) %>% 
  select(orgext, powercon, cohesion) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)  

  # Scatterplot
scat1 <- df %>% filter(e_regionpol_6C == 1) %>% ggplot(aes(year, orgext))   + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal() 
scat2 <- df %>% filter(e_regionpol_6C == 1) %>% ggplot(aes(year, powercon)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal() 
scat3 <- df %>% filter(e_regionpol_6C == 1) %>% ggplot(aes(year, cohesion)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal() 

grid.arrange(scat1, scat2, scat3, ncol = 3, nrow = 1)
```

### Latin America and the Caribbean

```{r, message=FALSE, warning=FALSE, fig.align='center'}
    # Descriptive stats
df %>%
  filter(e_regionpol_6C == 2) %>%
  select(orgext, powercon, cohesion) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)  

  # Scatterplot
scat1 <- df %>% filter(e_regionpol_6C == 2) %>% ggplot(aes(year, orgext))   + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal() 
scat2 <- df %>% filter(e_regionpol_6C == 2) %>% ggplot(aes(year, powercon)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal() 
scat3 <- df %>% filter(e_regionpol_6C == 2) %>% ggplot(aes(year, cohesion)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal() 

grid.arrange(scat1, scat2, scat3, ncol = 3, nrow = 1)
```

### The Middle East and Northern Africa

```{r, message=FALSE, warning=FALSE, fig.align='center'}
    # Descriptive stats
df %>%
  filter(e_regionpol_6C == 3) %>%
  select(orgext, powercon, cohesion) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)  

  # Scatterplot
scat1 <- df %>% filter(e_regionpol_6C == 3) %>% ggplot(aes(year, orgext))   + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal() 
scat2 <- df %>% filter(e_regionpol_6C == 3) %>% ggplot(aes(year, powercon)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal() 
scat3 <- df %>% filter(e_regionpol_6C == 3) %>% ggplot(aes(year, cohesion)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal() 

grid.arrange(scat1, scat2, scat3, ncol = 3, nrow = 1)
```

### Sub-Saharan Africa

```{r, message=FALSE, warning=FALSE, fig.align='center'}
    # Descriptive stats
df %>%
  filter(e_regionpol_6C == 4) %>%
  select(orgext, powercon, cohesion) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)  

  # Scatterplot
scat1 <- df %>% filter(e_regionpol_6C == 4) %>% ggplot(aes(year, orgext))   + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal() 
scat2 <- df %>% filter(e_regionpol_6C == 4) %>% ggplot(aes(year, powercon)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal() 
scat3 <- df %>% filter(e_regionpol_6C == 4) %>% ggplot(aes(year, cohesion)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal() 

grid.arrange(scat1, scat2, scat3, ncol = 3, nrow = 1)
```

### Western Europe and North America

```{r, message=FALSE, warning=FALSE, fig.align='center'}
    # Descriptive stats
df %>%
  filter(e_regionpol_6C == 5) %>%
  select(orgext, powercon, cohesion) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)  

  # Scatterplot
scat1 <- df %>% filter(e_regionpol_6C == 5) %>% ggplot(aes(year, orgext))   + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal() 
scat2 <- df %>% filter(e_regionpol_6C == 5) %>% ggplot(aes(year, powercon)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal() 
scat3 <- df %>% filter(e_regionpol_6C == 5) %>% ggplot(aes(year, cohesion)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal() 

grid.arrange(scat1, scat2, scat3, ncol = 3, nrow = 1)
```

### Asia and Pacific

```{r, message=FALSE, warning=FALSE, fig.align='center'}
    # Descriptive stats
df %>%
  filter(e_regionpol_6C == 6) %>%
  select(orgext, powercon, cohesion) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)  

  # Scatterplot
scat1 <- df %>% filter(e_regionpol_6C == 6) %>% ggplot(aes(year, orgext))   + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal() 
scat2 <- df %>% filter(e_regionpol_6C == 6) %>% ggplot(aes(year, powercon)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal() 
scat3 <- df %>% filter(e_regionpol_6C == 6) %>% ggplot(aes(year, cohesion)) + geom_point(alpha=0.1) + geom_smooth(size=2) + theme_minimal() 

grid.arrange(scat1, scat2, scat3, ncol = 3, nrow = 1)
```

##  {.unlisted .unnumbered}

</br>

------------------------------------------------------------------------

# Part Four - Cross-Validation

------------------------------------------------------------------------

While party scholars have long been interested in party policy stances, data on party organizational characteristics is sparse. Few other expert surveys contain data on parties' internal lives. Although there is no 1:1 correspondence to the V-Party items, we cross-validate those questions that at least partially overlap or aim at similar characteristics. In addition, V-Party experts were asked to rate each party on an election-year basis. Other surveys often contain a reference year which, however, seldom matches an election year. Furthermore, for some older expert surveys it is not always clear to which time frame the data refers. For this reason, we perform a "fuzzy match": each external observation is paired with those V-Party election years that fall in the five years up to and including the publication or reference date of the corresponding source. Data for the validation was taken from the "Democratic Accountability and Linkages Project. 2008-9 Dataset" (Kitschelt 2013), the "Political Parties Database Project" (PPDB) (Poguntke et al., 2020) and Giger and Schumacher's (2015) compilation of a diverse set of older expert surveys in their "Integrated Party Organization Dataset" (IPOD). Below, we describe the selected items and report descriptive statistics and correlations for overlapping cases. In brackets, we also indicate the expected sign of each correlation, which depends on the original scales of the matched items.

Overall, we find a decent level of convergent validity. Given the (very) different question wordings, response categories and operationalizations, and given the need to "fuzzy match" the data because the reference periods differ, one would not expect much overlap. Still, V-Party data aligns well with extant data on party organizational characteristics from a broad range of recent and older surveys. This lends support to the validity of V-Party data even where experts were asked to judge party features looking back to the 1970s.
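
The matching window described above can be illustrated with a small base-R sketch. It does not use `fuzzyjoin` and is not the code that produces the reported results; the data frames `survey` and `vparty` and all IDs and years in them are hypothetical.

```r
# Hypothetical toy inputs: `survey` mimics an external expert survey with a
# reference year, `vparty` mimics V-Party election years. All values invented.
survey <- data.frame(v2paid = c(1, 2), year = c(2008, 2008))
vparty <- data.frame(v2paid = c(1, 1, 1, 2),
                     year   = c(2001, 2005, 2009, 2007))

# Pair all rows with the same party ID, then keep only V-Party election
# years (year.y) inside the window [reference year - 5, reference year].
pairs <- merge(survey, vparty, by = "v2paid", suffixes = c(".x", ".y"))
overlap_toy <- pairs[pairs$year.y <= pairs$year.x &
                     pairs$year.y >= pairs$year.x - 5, ]
print(overlap_toy)
```

In this toy case, party 1 keeps only its 2005 election (2001 is too early, 2009 lies after the reference year) and party 2 keeps 2007. A survey entry can pair with more than one election year whenever several elections fall inside the window.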

## Cross-validating V-Party with... {.tabset .tabset-pills}

### Rohrschneider and Whitefield (2012)

```{r, message=FALSE, warning=FALSE, fig.align='center'}

  # Fuzzy match
match1 <- crossv_rw12 %>% select(v2paid, year, q21, q22, q23iaimp, q23ibimp, q23icimp)
match2 <- df %>% select(v2paid, year, v2paactcom, v2pasoctie, v2panom, v2paind, orgext, powercon)

overlap <- match1 %>% fuzzy_left_join(match2, by = c("v2paid" = "v2paid", "year" = "year"), match_fun = list(`==`, `>=`)) %>% filter(year.x - 5 <= year.y)
```

-   Number of "fuzzy matching" observations: `r nrow(overlap)`
-   Number of overlapping, unique parties: `r overlap %>% select(v2paid.x) %>% distinct() %>% nrow()`

</br>

#### Organizational extensiveness

-   *Original question*: q21 "Does the party have a 'significant' membership base in terms of numbers?"
-   *"Matching" items*: v2paactcom (+), orgext (+)

```{r, message=FALSE, warning=FALSE, fig.align='center'}

    # Descriptive stats
overlap %>%
  select(q21, v2paactcom, orgext) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)

  # Scatterplot
scat1 <- overlap %>% ggplot(aes(q21, v2paactcom)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat2 <- overlap %>% ggplot(aes(q21, orgext))     + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 

grid.arrange(scat1, scat2, ncol = 2, nrow = 1)
```

</br>

-   *Original question*: q22 "Does the party have an organisational affiliation with any interest group or civil society group, such as trade unions, business associations, church groups, etc?"
-   *"Matching" items*: v2pasoctie (+), orgext (+)

```{r, message=FALSE, warning=FALSE, fig.align='center'}

    # Descriptive stats
overlap %>%
  select(q22, v2pasoctie, orgext) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)

  # Scatterplot
scat1 <- overlap %>% ggplot(aes(q22, v2pasoctie)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat2 <- overlap %>% ggplot(aes(q22, orgext))     + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 

grid.arrange(scat1, scat2, ncol = 2, nrow = 1)
```

</br>

#### Intra-party power concentration

-   *Original question*: q23iaimp "The extent to which party membership is strong in determining party policy?"
-   *Original question*: q23ibimp "The extent to which party apparatus is strong in determining party policy?"
-   *Original question*: q23icimp "The extent to which party leadership is strong in determining party policy?"
-   *"Matching" items*: v2panom (+ o -), v2paind (- o +), powercon (+ o -)

```{r, message=FALSE, warning=FALSE, fig.align='center'}

    # Descriptive stats
overlap %>%
  select(q23iaimp, q23ibimp, q23icimp, v2panom, v2paind, powercon) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)

  # Scatterplot
scat1 <- overlap %>% ggplot(aes(q23iaimp, v2panom))  + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat2 <- overlap %>% ggplot(aes(q23ibimp, v2panom))  + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat3 <- overlap %>% ggplot(aes(q23icimp, v2panom))  + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 

scat4 <- overlap %>% ggplot(aes(q23iaimp, v2paind))  + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat5 <- overlap %>% ggplot(aes(q23ibimp, v2paind))  + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat6 <- overlap %>% ggplot(aes(q23icimp, v2paind))  + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 

scat7 <- overlap %>% ggplot(aes(q23iaimp, powercon)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat8 <- overlap %>% ggplot(aes(q23ibimp, powercon)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat9 <- overlap %>% ggplot(aes(q23icimp, powercon)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 

grid.arrange(scat1, scat2, scat3, ncol = 3, nrow = 1)
grid.arrange(scat4, scat5, scat6, ncol = 3, nrow = 1)
grid.arrange(scat7, scat8, scat9, ncol = 3, nrow = 1)
```

### Laver and Hunt (1992)

```{r, message=FALSE, warning=FALSE, fig.align='center'}

  # Fuzzy match
match1 <- crossv_lh92 %>% select(v2paid, year, LeadsIPPMS, LegisIPPMS, ActsIPPMS, LeadsCPMS, LegisCPMS, ActsCPMS)
match2 <- df %>% select(v2paid, year, v2panom, v2paind, powercon)

overlap <- match1 %>% fuzzy_left_join(match2, by = c("v2paid" = "v2paid", "year" = "year"), match_fun = list(`==`, `>=`)) %>% filter(year.x - 5 <= year.y)
overlap <- overlap %>% filter(LeadsCPMS < 7 & LegisCPMS < 28) # Omit outliers
```

-   Number of "fuzzy matching" observations: `r nrow(overlap)`
-   Number of overlapping, unique parties: `r overlap %>% select(v2paid.x) %>% distinct() %>% nrow()`

</br>

#### Intra-party power concentration

-   *Original question*: LeadsIPPMS "Leadership influence on party policy - mean scores"
-   *Original question*: LegisIPPMS "Legislators influence on party policy - mean scores"
-   *Original question*: ActsIPPMS "Activists' influence on party policy - mean scores"
-   *"Matching" items*: v2panom (- o +), v2paind (+ o -), powercon (- o +)

```{r, message=FALSE, warning=FALSE, fig.align='center'}

    # Descriptive stats
overlap %>%
  select(LeadsIPPMS, LegisIPPMS, ActsIPPMS, v2panom, v2paind, powercon) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)

  # Scatterplot
scat1 <- overlap %>% ggplot(aes(LeadsIPPMS, v2panom)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat2 <- overlap %>% ggplot(aes(LegisIPPMS, v2panom)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat3 <- overlap %>% ggplot(aes(ActsIPPMS,  v2panom)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 

scat7 <- overlap %>% ggplot(aes(LeadsIPPMS, v2paind)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat8 <- overlap %>% ggplot(aes(LegisIPPMS, v2paind)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat9 <- overlap %>% ggplot(aes(ActsIPPMS,  v2paind)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 

scat13 <- overlap %>% ggplot(aes(LeadsIPPMS, powercon)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat14 <- overlap %>% ggplot(aes(LegisIPPMS, powercon)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat15 <- overlap %>% ggplot(aes(ActsIPPMS,  powercon)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 

grid.arrange(scat1, scat2, scat3, ncol = 3, nrow = 1)
grid.arrange(scat7, scat8, scat9, ncol = 3, nrow = 1)
grid.arrange(scat13, scat14, scat15, ncol = 3, nrow = 1)
```

</br>

-   *Original question*: LeadsCPMS "Leadership influence on cabinet participation - mean scores"
-   *Original question*: LegisCPMS "Legislators influence on cabinet participation - mean scores"
-   *Original question*: ActsCPMS "Activists' influence on cabinet participation - mean scores"
-   *"Matching" items*: v2panom (- o +), v2paind (+ o -), powercon (- o +)

```{r, message=FALSE, warning=FALSE, fig.align='center'}

    # Descriptive stats
overlap %>%
  select(LeadsCPMS, LegisCPMS, ActsCPMS, v2panom, v2paind, powercon) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)

  # Scatterplot
scat4 <- overlap %>% ggplot(aes(LeadsCPMS,  v2panom)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat5 <- overlap %>% ggplot(aes(LegisCPMS,  v2panom)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat6 <- overlap %>% ggplot(aes(ActsCPMS,   v2panom)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 

scat10 <- overlap %>% ggplot(aes(LeadsCPMS,  v2paind)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat11 <- overlap %>% ggplot(aes(LegisCPMS,  v2paind)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat12 <- overlap %>% ggplot(aes(ActsCPMS,   v2paind)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 

scat16 <- overlap %>% ggplot(aes(LeadsCPMS, powercon)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat17 <- overlap %>% ggplot(aes(LegisCPMS, powercon)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat18 <- overlap %>% ggplot(aes(ActsCPMS,  powercon)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 

grid.arrange(scat4, scat5, scat6, ncol = 3, nrow = 1)
grid.arrange(scat10, scat11, scat12, ncol = 3, nrow = 1)
grid.arrange(scat16, scat17, scat18, ncol = 3, nrow = 1)
```

### Schumacher and Giger (2017)

```{r, message=FALSE, warning=FALSE, fig.align='center'}

  # Fuzzy match
match1 <- crossv_gs15 %>% select(v2paid, year, janda_bopla, lh_bopla, rw_bopla, lhrw_bopla, bopla)
match2 <- df %>% select(v2paid, year, v2paind, powercon)

overlap <- match1 %>% fuzzy_left_join(match2, by = c("v2paid" = "v2paid", "year" = "year"), match_fun = list(`==`, `>=`)) %>% filter(year.x - 5 <= year.y)
```

-   Number of "fuzzy matching" observations: `r nrow(overlap)`
-   Number of overlapping, unique parties: `r overlap %>% select(v2paid.x) %>% distinct() %>% nrow()`

</br>

#### Intra-party power concentration

-   *Original question*: Combining different items from other expert surveys (Janda 1980, Harmel and Janda 1994, Laver and Hunt 1992 and Rohrschneider and Whitefield 2012), Schumacher and Giger (2017, 168-70) construct several indices to capture "leadership domination", theoretically ranging between −1 (activist-dominated) and 1 (leadership-dominated).
-   *"Matching" items*: v2paind (+), powercon (-)

```{r, message=FALSE, warning=FALSE, fig.align='center'}

    # Descriptive stats
overlap %>%
  select(janda_bopla, lh_bopla, rw_bopla, lhrw_bopla, bopla, v2paind, powercon) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)

  # Scatterplot
scat1 <- overlap %>% ggplot(aes(janda_bopla, v2paind)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat2 <- overlap %>% ggplot(aes(lh_bopla,    v2paind)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat3 <- overlap %>% ggplot(aes(rw_bopla,    v2paind)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat4 <- overlap %>% ggplot(aes(lhrw_bopla,  v2paind)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat5 <- overlap %>% ggplot(aes(bopla,       v2paind)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 

scat6 <- overlap %>% ggplot(aes(janda_bopla, powercon)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat7 <- overlap %>% ggplot(aes(lh_bopla,    powercon)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat8 <- overlap %>% ggplot(aes(rw_bopla,    powercon)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat9 <- overlap %>% ggplot(aes(lhrw_bopla,  powercon)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat10 <- overlap %>% ggplot(aes(bopla,      powercon)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 

grid.arrange(scat1, scat6, scat2, scat7, scat3, scat8, scat4, scat9, scat5, scat10, ncol = 2, nrow = 5)
#grid.arrange(scat6, scat7, scat8, scat9, scat10, ncol = 3, nrow = 2)
```

### Harmel and Janda (1994)

```{r, message=FALSE, warning=FALSE, fig.align='center'}

  # Fuzzy match
match1 <- crossv_hj95 %>% select(v2paid, year, candsel)
match2 <- df %>% select(v2paid, year, v2panom, v2paind, powercon)

overlap <- match1 %>% fuzzy_left_join(match2, by = c("v2paid" = "v2paid", "year" = "year"), match_fun = list(`==`, `>=`)) %>% filter(year.x - 5 <= year.y)
```

-   Number of "fuzzy matching" observations: `r nrow(overlap)`
-   Number of overlapping, unique parties: `r overlap %>% select(v2paid.x) %>% distinct() %>% nrow()`

</br>

#### Intra-party power concentration

-   *Original question*: Harmel and Janda asked about the way candidates are selected ranging from 1 (Party 'supporters' are responsible for nominating the parliamentary candidates) to 8 (The national party organization nominates the parliamentary candidates).
-   *"Matching" items*: v2panom (-), v2paind (+), powercon (-)

```{r, message=FALSE, warning=FALSE, fig.align='center'}

    # Descriptive stats
overlap %>%
  select(candsel, v2panom, v2paind, powercon) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)

  # Scatterplot
scat1 <- overlap %>% ggplot(aes(candsel, v2panom))  + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 
scat2 <- overlap %>% ggplot(aes(candsel, v2paind))  + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 
scat3 <- overlap %>% ggplot(aes(candsel, powercon)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 

grid.arrange(scat1, scat2, scat3, ncol = 3, nrow = 1)
```

### Kitschelt (2013)

```{r, message=FALSE, warning=FALSE, fig.align='center'}

  # Fuzzy match
match1 <- crossv_dalp %>% select(v2paid, year, a1, a5, a5a, a6, a6a)
match2 <- df %>% select(v2paid, year, v2palocoff, v2panom, v2paind, orgext, powercon)

overlap <- match1 %>% fuzzy_left_join(match2, by = c("v2paid" = "v2paid", "year" = "year"), match_fun = list(`==`, `>=`)) %>% filter(year.x - 5 <= year.y)
```

-   Number of "fuzzy matching" observations: `r nrow(overlap)`
-   Number of overlapping, unique parties: `r overlap %>% select(v2paid.x) %>% distinct() %>% nrow()`

</br>

#### Organizational extensiveness

-   *Original question*: a1 "Do the following parties or their individual candidates maintain offices and paid staff at the local or municipal level? If yes, are these offices and staff permanent or only during national elections?" ranging from 1 (Yes, the party maintains permanent local offices in MOST districts) to 4 (No, the party does not maintain local offices).
-   *"Matching" items*: v2palocoff (-), orgext (-)

```{r, message=FALSE, warning=FALSE, fig.align='center'}

    # Descriptive stats
overlap %>%
  select(a1, v2palocoff, orgext) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)

  # Scatterplot
scat1 <- overlap %>% ggplot(aes(a1, v2palocoff)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat2 <- overlap %>% ggplot(aes(a1, orgext))     + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 

grid.arrange(scat1, scat2, ncol = 2, nrow = 1)
```
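
The negative expected sign for a1 follows from the direction of its scale: low values of a1 indicate permanent local offices, high values indicate none. The sketch below illustrates this with invented ratings. It assumes that higher values of the V-Party item mean more local presence; the vectors `a1_toy` and `locoff_toy` are hypothetical.

```r
# Invented ratings for illustration only.
a1_toy     <- c(1, 1, 2, 3, 4, 4)              # 1 = permanent offices ... 4 = none
locoff_toy <- c(3.1, 2.7, 2.0, 1.2, 0.4, 0.6)  # assumed: higher = more local offices

# Correlation on the original scales: negative, because the scales run in
# opposite directions.
r_raw <- cor(a1_toy, locoff_toy)

# Reverse-coding a1 (1 <-> 4, 2 <-> 3) flips the sign but not the magnitude.
a1_rev <- 5 - a1_toy
r_rev  <- cor(a1_rev, locoff_toy)

print(c(raw = r_raw, reversed = r_rev))
```

Reverse-coding leaves the strength of the association unchanged, which is why the report keeps the original scales and states the expected sign instead.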

</br>

#### Intra-party power concentration

-   *Original question*: a5 "Which of the following four options best describes the following parties' balance of power in selecting candidates for national legislative elections?" ranging from 1 (National party leaders control the process of candidate selections) over 3 (Local/municipal actors control the process of candidate selections) to 4 (Selection is the outcome of bargaining between different levels).
-   *"Matching" items*: v2panom (+), powercon (+)

*Note*: Response categories 1 to 3 reflect an ordering, but 4 does not. We therefore also plot the correlation with category 4 excluded (second pair of plots), which corroborates the correspondence.

```{r, message=FALSE, warning=FALSE, fig.align='center'}

    # Descriptive stats
overlap %>%
  select(a5, v2panom, powercon) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)

  # Scatterplot
scat1 <- overlap %>% ggplot(aes(a5, v2panom))  + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat2 <- overlap %>% ggplot(aes(a5, powercon)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 

scat3 <- overlap %>% filter(a5 <= 3) %>% ggplot(aes(a5, v2panom))  + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat4 <- overlap %>% filter(a5 <= 3) %>% ggplot(aes(a5, powercon)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 

grid.arrange(scat1, scat2, ncol = 2, nrow = 1)
grid.arrange(scat3, scat4, ncol = 2, nrow = 1)
```
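
Why the bargaining category is set aside can be seen in a small sketch with invented numbers. The vectors `a5_toy` and `nom_toy` are hypothetical and chosen only so that category 4 sits off the 1-to-3 ordering; they say nothing about the actual data.

```r
# Invented data: categories 1 to 3 are ordered, category 4 ("bargaining")
# carries no position on that ordering.
a5_toy  <- c(1, 1, 2, 2, 3, 3, 4, 4)
nom_toy <- c(0.2, 0.5, 1.1, 1.4, 2.3, 2.0, 0.9, 1.6)

# Pearson correlation using all categories.
r_all <- cor(a5_toy, nom_toy)

# Pearson correlation restricted to the ordered categories 1 to 3.
keep  <- a5_toy <= 3
r_ord <- cor(a5_toy[keep], nom_toy[keep])

print(c(all = r_all, ordered_only = r_ord))
```

In this constructed example, treating 4 as the top of the scale attenuates the correlation, because observations coded 4 are placed at an end of the scale where they do not belong.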

</br>

-   *Original question*: a6 "Similarly, which of the following options best characterizes the process by which the following parties decide on electoral strategy, for example campaign platforms and slogans, coalition strategies, and campaign resource allocations?" ranging from 1 (Electoral Strategy is chosen by national party leaders with little participation from local or state level organizations) over 3 (Electoral strategy is chosen by local or municipal level actors) to 4 (The choice of electoral strategy is the outcome of bargaining between the different levels of party organization).
-   *"Matching" items*: v2paind (-), powercon (+)

*Note*: Response categories 1 to 3 reflect an ordering, but 4 does not. We therefore also plot the correlation with category 4 excluded (second pair of plots), which corroborates the correspondence.

```{r, message=FALSE, warning=FALSE, fig.align='center'}

    # Descriptive stats
overlap %>%
  select(a6, v2paind, powercon) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)

  # Scatterplot
scat1 <- overlap %>% ggplot(aes(a6, v2paind))  + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat2 <- overlap %>% ggplot(aes(a6, powercon)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 

scat3 <- overlap %>% filter(a6 <= 3) %>% ggplot(aes(a6, v2paind))  + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat4 <- overlap %>% filter(a6 <= 3) %>% ggplot(aes(a6, powercon)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 

grid.arrange(scat1, scat2, ncol = 2, nrow = 1)
grid.arrange(scat3, scat4, ncol = 2, nrow = 1)
```

### Janda (1980)

```{r, message=FALSE, warning=FALSE, fig.align='center'}

  # Fuzzy match
match1 <- crossv_ja80 %>% select(v2paid, year,leadfact, strafact, ideofac, leadcon, formulpol, selparliacan, natofstruc, selnatlead, exnorg)
match2 <- df %>% select(v2paid, year, v2padisa, v2paind, v2panom, v2palocoff, powercon, orgext)

overlap <- match1 %>% fuzzy_left_join(match2, by = c("v2paid" = "v2paid", "year" = "year"), match_fun = list(`==`, `>=`)) %>% filter(year.x - 5 <= year.y)
```

-   Number of "fuzzy matching" observations: `r nrow(overlap)`
-   Number of overlapping, unique parties: `r overlap %>% select(v2paid.x) %>% distinct() %>% nrow()`

</br>

#### Organizational extensiveness

-   *Original question*: Janda asked about the nationalization of the structure ranging from 0 (Local organizations, defined as constituency/municipal/commune/county level or lower are the only discernible structural element in the party) to 6 (There is a discernible party hierarchy that has a single national council or executive committee at the top acting directly on the local organizations).
-   *Original question*: Janda asked about extensiveness of organizations ranging from 0 (Either there are no identifiable party organs or the only organs that can be identified are national organs) to 6 (The most intensive level of organization for the party can be found throughout the country).
-   *"Matching" items*: v2palocoff (- +), orgext (- +)

```{r, message=FALSE, warning=FALSE, fig.align='center'}

    # Descriptive stats
overlap %>%
  select(natofstruc, exnorg, v2palocoff, orgext) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)

  # Scatterplot
scat1 <- overlap %>% ggplot(aes(natofstruc, v2palocoff)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 
scat2 <- overlap %>% ggplot(aes(natofstruc, orgext))     + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 

scat3 <- overlap %>% ggplot(aes(exnorg, v2palocoff)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 
scat4 <- overlap %>% ggplot(aes(exnorg, orgext))     + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 

grid.arrange(scat1, scat2, ncol = 2, nrow = 1)
grid.arrange(scat3, scat4, ncol = 2, nrow = 1)
```

</br>

#### Intra-party power concentration

-   *Original question*: Janda asked about the leadership concentration ranging from 0 (Leadership is so dispersed that only local or regional leaders can be identified) to 6 (Leadership is exercised by one individual who can personally commit the party to binding courses of action).
-   *Original question*: Janda asked about formulating policies ranging from 0 (Responsibility for formulating policy is diffused throughout the party) to 7 (Major policy positions are determined and announced by the party leader or a small subgroup of the national committee).
-   *"Matching" items*: v2paind (+ +), powercon (- -)

```{r, message=FALSE, warning=FALSE, fig.align='center'}

    # Descriptive stats
overlap %>%
  select(leadcon, formulpol, v2paind, powercon) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)

  # Scatterplot
scat1 <- overlap %>% ggplot(aes(leadcon, v2paind))  + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 
scat2 <- overlap %>% ggplot(aes(leadcon, powercon)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 

scat3 <- overlap %>% ggplot(aes(formulpol, v2paind))  + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 
scat4 <- overlap %>% ggplot(aes(formulpol, powercon)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 

grid.arrange(scat1, scat2, ncol = 2, nrow = 1)
grid.arrange(scat3, scat4, ncol = 2, nrow = 1)
```

</br>

-   *Original question*: Janda asked about selecting parliamentary candidates ranging from 0 (Nominations are determined locally by vote or party supporters) to 7 (Selection is determined by a national committee or party council).
-   *Original question*: Janda asked about selecting a national leader ranging from 0 (No national party leader can be identified) to 8 (He is selected by his predecessor).
-   *"Matching" items*: v2panom (- -), v2paind (+ +), powercon (- -)

```{r, message=FALSE, warning=FALSE, fig.align='center'}

    # Descriptive stats
overlap %>%
  select(selparliacan, selnatlead, v2panom, v2paind, powercon) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)

  # Scatterplot
scat1 <- overlap %>% ggplot(aes(selparliacan, v2panom))  + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 
scat2 <- overlap %>% ggplot(aes(selparliacan, v2paind))  + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 
scat3 <- overlap %>% ggplot(aes(selparliacan, powercon)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 

scat4 <- overlap %>% ggplot(aes(selnatlead, v2panom))  + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 
scat5 <- overlap %>% ggplot(aes(selnatlead, v2paind))  + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 
scat6 <- overlap %>% ggplot(aes(selnatlead, powercon)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 

grid.arrange(scat1, scat2, scat3, ncol = 3, nrow = 1)
grid.arrange(scat4, scat5, scat6, ncol = 3, nrow = 1)
```

</br>

#### Elite cohesion

-   *Original question*: Janda asked about leadership factionalization ranging from 0 (Leadership contests for control of the party either do not occur or they are so covert or so "inside" that they do not engage large number of party members in their outcome) to 6 (Followers of a political leader have created a 'large' faction within the party with some formal organization of its own).
-   *Original question*: Janda asked about struggle over goals ranging from 0 (There is little or no disagreement voiced within the party concerning its appropriate strategy or tactics with regard to its goal orientation) to 6 (Adherents to a certain line of strategy or tactics have created a 'large' faction within the party with some formal organization of its own).
-   *Original question*: Janda asked about struggle over ideology ranging from 0 (Ideological concerns are not subject to public debate and disagreement among party leaders) to 6 (Ideological concerns have created a "large" faction within the party with some formal organization of its own).
-   *"Matching" items*: v2padisa (- - -)

```{r, message=FALSE, warning=FALSE, fig.align='center'}

    # Descriptive stats
overlap %>%
  select(leadfact, strafact, ideofac, v2padisa) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)

  # Scatterplot
scat1 <- overlap %>% ggplot(aes(leadfact, v2padisa)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 
scat2 <- overlap %>% ggplot(aes(strafact, v2padisa)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 
scat3 <- overlap %>% ggplot(aes(ideofac,  v2padisa)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 

grid.arrange(scat1, scat2, scat3, ncol = 3, nrow = 1)
```

### Poguntke et al. (2020)

```{r, message=FALSE, warning=FALSE, fig.align='center'}

  # Fuzzy match
match1 <- crossv_ppdb %>% select(v2paid, YEAR, CR16ALLMBR, A46LOWLVL, A48LOWNUM, A56UNIONNUM, A57CORPNUM, CON_AFFIL_ORGA, LEADER_STRENGTH) %>% rename(year = YEAR)
match2 <- df %>% select(v2paid, year, v2palocoff, v2paactcom, v2pasoctie, v2panom, v2paind, orgext, powercon)

overlap <- match1 %>% fuzzy_left_join(match2, by = c("v2paid" = "v2paid", "year" = "year"), match_fun = list(`==`, `>=`)) %>% filter(year.x - 5 <= year.y)
```

-   Number of "fuzzy matching" observations: `r nrow(overlap)`
-   Number of overlapping, unique parties: `r overlap %>% select(v2paid.x) %>% distinct() %>% nrow()`

</br>

#### Organizational extensiveness

-   *Original question*: CR16ALLMBR "Total number of individual plus corporate (indirect) members".
-   *"Matching" items*: v2paactcom (+), orgext (+)

*Note*: In the scatterplots, the number of members is filtered for outliers \> 100,000.

```{r, message=FALSE, warning=FALSE, fig.align='center'}

    # Descriptive stats
overlap %>%
  select(CR16ALLMBR, v2paactcom, orgext) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)

  # Scatterplot
scat1 <- overlap %>% filter(CR16ALLMBR <= 100000) %>% ggplot(aes(CR16ALLMBR, v2paactcom)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat2 <- overlap %>% filter(CR16ALLMBR <= 100000) %>% ggplot(aes(CR16ALLMBR, orgext))     + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 

grid.arrange(scat1, scat2, ncol = 2, nrow = 1)
```

</br>

-   *Original question*: A48LOWNUM Poguntke et al. asked for the number of the smallest units represented at higher levels.
-   *"Matching" items*: v2palocoff (+), orgext (+)

*Note*: In the scatterplots, the number of units is filtered for outliers \>= 750.

```{r, message=FALSE, warning=FALSE, fig.align='center'}

    # Descriptive stats
overlap %>%
  select(A48LOWNUM, v2palocoff, orgext) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)

  # Scatterplot
scat1 <- overlap %>% filter(A48LOWNUM < 750) %>% ggplot(aes(A48LOWNUM, v2palocoff)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 
scat2 <- overlap %>% filter(A48LOWNUM < 750) %>% ggplot(aes(A48LOWNUM, orgext))     + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "pearson", p.accuracy = 0.001, r.accuracy = 0.001) + theme_minimal() 

grid.arrange(scat1, scat2, ncol = 2, nrow = 1)
```

</br>

-   *Original question*: A58CONWOMEN to A66CONCORP Poguntke et al. asked whether nine different groups or affiliated organizations such as women's suborganizations, trade unions or affiliated business peak associations may send delegates to the party congress, or not. From this, an additive index is constructed, theoretically ranging from 0 to 9.
-   *"Matching" items*: v2paactcom (+), v2pasoctie (+), orgext (+)
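
As a minimal sketch of how such an additive index can be built, the example below sums nine binary items per party on invented toy data. The column names `item1` to `item9` and the party labels are placeholders, not the PPDB variable names. How missing items are treated in `CON_AFFIL_ORGA` is not documented in this section, so the sketch simply uses complete toy data.

```r
# Toy data: three parties, nine yes/no items (1 = group may send delegates).
# All values and names are invented for illustration.
toy_items <- as.data.frame(rbind(
  party_a = c(1, 1, 1, 1, 1, 1, 1, 1, 1),
  party_b = c(0, 0, 0, 0, 0, 0, 0, 0, 0),
  party_c = c(1, 0, 1, 0, 0, 1, 0, 0, 0)
))
names(toy_items) <- paste0("item", 1:9)

# Additive index: count of items coded 1, theoretically ranging from 0 to 9
toy_index <- rowSums(toy_items)
toy_index
```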

```{r, message=FALSE, warning=FALSE, fig.align='center'}

    # Descriptive stats
overlap %>%
  select(CON_AFFIL_ORGA, v2paactcom, v2pasoctie, orgext) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)

  # Scatterplot
scat1 <- overlap %>% ggplot(aes(CON_AFFIL_ORGA, v2paactcom)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 
scat2 <- overlap %>% ggplot(aes(CON_AFFIL_ORGA, v2pasoctie)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 
scat3 <- overlap %>% ggplot(aes(CON_AFFIL_ORGA, orgext))     + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 

grid.arrange(scat1, scat2, scat3, ncol = 3, nrow = 1)
```

</br>

#### Intra-party power concentration

-   *Original question*: C11DEPUTY to C19LDRROLE3 Poguntke et al. asked for the statutory power of party leaders on nine issues. From this, an additive index is constructed, theoretically ranging from 0 to 9.
-   *"Matching" items*: v2panom (-), v2paind (+), powercon (-)

```{r, message=FALSE, warning=FALSE, fig.align='center'}

    # Descriptive stats
overlap %>%
  select(LEADER_STRENGTH, v2panom, v2paind, powercon) %>% 
  my_skim() %>%
  yank(., "numeric") %>% 
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = T)

  # Scatterplot
scat1 <- overlap %>% ggplot(aes(LEADER_STRENGTH, v2panom))  + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 
scat2 <- overlap %>% ggplot(aes(LEADER_STRENGTH, v2paind))  + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 
scat3 <- overlap %>% ggplot(aes(LEADER_STRENGTH, powercon)) + geom_point(alpha=0.1) + geom_smooth(method = "lm", size=2) + stat_cor(method = "spearman", p.accuracy = 0.001, r.accuracy = 0.001, cor.coef.name = "rho") + theme_minimal() 

grid.arrange(scat1, scat2, scat3, ncol = 3, nrow = 1)
```

##  {.unlisted .unnumbered}

</br>

------------------------------------------------------------------------

# Part Five - Survival Analysis

------------------------------------------------------------------------

In line with Adcock and Collier's (2001, 542) "AHEM validation; that is, 'Assume the Hypothesis, Evaluate the Measure'", we inspect the construct validity of the V-Party measures by examining the association between party organizational features and the survival of political parties. Durability is captured as the number of consecutive elections; the dependent variable, "party death", occurs when a party's vote share in national legislative elections falls below 5 percent. To analyze the impact of extensive organizations, devolved decision-making structures, and elite cohesion on party durability, we employ a discrete-time event history modeling framework (Box-Steffensmeier and Jones 2004). Below we report the regression table for Models 1 and 2 of the main text, including alternative specifications of the random effects. Furthermore, we apply the main model to subgroups and test the impact of single items instead of the aggregated dimensions. Overall, these analyses corroborate our main analysis and lend further support to the nomological validity of V-Party data.

While the ordering remains stable, looking at each regime type separately alters the results in that organizational extensiveness is less important for party durability in liberal democracies. Instead, elite cohesion and the internal balance of power play a more pronounced role there; that is, parties geared towards the leader have a higher risk of "dying" in liberal democracies than their counterparts in electoral democracies or electoral autocracies.

Looking at regions and decades separately reveals notable patterns that call for further investigation. Organizational extensiveness is a good remedy against party death across all regions and decades, although the effect size is less pronounced in recent times. Giving lower party cadres more say in internal politics positively affects the durability of parties above all in Latin America and the Caribbean and in Western Europe and North America, but it has a negative effect on party persistence in Sub-Saharan Africa (although this effect is not statistically significant). Support for the expectation that hierarchical parties tend to "die" earlier is found mainly in the 1980s and 2010s and less so in the other decades. Struggle at the elite level, in turn, is a much stronger predictor of an early party death in Eastern Europe and Central Asia than in Latin America and the Caribbean or Asia and Pacific. Furthermore, this effect is stronger in the 1990s and 2000s than in more recent times.

There are no surprises regarding the disaggregated items: The results further highlight the importance of having an active presence of party activists and personnel in local communities (i.e. local organizational strength) and an extensive, nationwide network of local party branches while ties to social organizations and groups are slightly less important. Turning to the intra-party power balance, inclusive nomination procedures have a slightly stronger effect on party persistence than the personalization of a party.
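
The discrete-time event history setup described above can be sketched on a toy party-election panel. The construction of `death1` and `logcounter1` is not shown in this section, so the coding below (event indicator from the 5 percent threshold, a running count of elections per party, and its logarithm) is an assumption. For brevity, the sketch fits a plain logit with base R's `glm()` and omits the random intercepts and covariates of the reported models. All names and values are invented.

```r
# Toy party-election panel; vote shares are invented for illustration.
toy <- data.frame(
  party    = rep(c("A", "B", "C", "D"), each = 4),
  election = rep(1:4, times = 4),
  vote     = c(30, 28, 25, 27,
               12,  8,  6,  3,
                9,  4, NA, NA,   # party C is no longer observed after its "death"
               15, 11,  7,  4)
)
toy <- toy[!is.na(toy$vote), ]

# Event indicator: 1 in the election in which the vote share falls below 5 percent
toy$death <- as.integer(toy$vote < 5)

# Duration: running count of elections per party, entered on the log scale
toy$counter    <- ave(toy$election, toy$party, FUN = seq_along)
toy$logcounter <- log(toy$counter)

# Discrete-time hazard model: logit of the event on (log) duration
fit <- glm(death ~ logcounter, family = binomial(link = "logit"), data = toy)
summary(fit)$coefficients
```

Each row is one party at one election, and a party leaves the risk set once the event has occurred. The models reported below follow the same logic but are estimated with `glmer()` and random intercepts.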

## Regression results {.tabset .tabset-pills}

### Main analysis

```{r}
    # define common sample as some obs do not have complete data e.g. due to lags
df_analysis <- 
    df %>%
    filter(v2x_regime != 0) %>% 
    select(death1, orgext, powercon, v2padisa, statefund, logconsecel, v2x_polyarchy, parl, proportional, mixed, logcounter1, 
           country_id, v2paid, election_id, e_regionpol_6C, year, v2x_regime,
           v2palocoff, v2paactcom, v2pasoctie, v2panom, v2paindrev) %>%
    filter(complete.cases(.))
  

    # base model
m1 <- glmer(death1 ~ orgext + powercon + v2padisa + logcounter1 + (1|country_id) + (1|v2paid),
            family = binomial(link = "logit"), control = glmerControl(optimizer = "Nelder_Mead"), nAGQ = 0, data = df_analysis)

    # model including controls
m2 <- glmer(death1 ~ orgext + powercon + v2padisa + statefund + logconsecel + v2x_polyarchy + parl + proportional + mixed + logcounter1 + (1|country_id) + (1|v2paid),
            family = binomial(link = "logit"), control = glmerControl(optimizer = "Nelder_Mead"), nAGQ = 0, data = df_analysis)

    # base model, alternative RE: parties and elections
m3 <- glmer(death1 ~ orgext + powercon + v2padisa + logcounter1 + (1|election_id) + (1|v2paid),
            family = binomial(link = "logit"), control = glmerControl(optimizer = "Nelder_Mead"), nAGQ = 0, data = df_analysis)

    # model including controls, alternative RE: parties and elections
m4 <- glmer(death1 ~ orgext + powercon + v2padisa + statefund + logconsecel + v2x_polyarchy + parl + proportional + mixed + logcounter1 + (1|election_id) + (1|v2paid),
            family = binomial(link = "logit"), control = glmerControl(optimizer = "Nelder_Mead"), nAGQ = 0, data = df_analysis)

```

```{r, results = 'asis', echo = FALSE}
  # Output
htmlreg(l = list(m1, m2, m3, m4), doctype = FALSE, center = FALSE, caption = "", digits = 3, stars = c(0.001, 0.01, 0.05, 0.1),
        custom.model.names = c("Model 1 (main text)", "Model 2 (main text)", "M1 (alt. RE: elections)", "M2 (alt. RE: elections)"),
        custom.note = paste("%stars. Random intercepts logistic regression coefficients with standard errors in parentheses."))
```

### Subsets: Regime types

```{r}
    # electoral autocracies only
df_analysis_ea <- df_analysis %>% filter(v2x_regime == 1)
m1_ea <- glmer(death1 ~ orgext + powercon + v2padisa + statefund + logconsecel + v2x_polyarchy + parl + proportional + mixed + logcounter1 + (1|country_id) + (1|v2paid),
            family = binomial(link = "logit"), control = glmerControl(optimizer = "Nelder_Mead"), nAGQ = 0, data = df_analysis_ea)

    # electoral democracies only
df_analysis_ed <- df_analysis %>% filter(v2x_regime == 2)
m2_ed <- glmer(death1 ~ orgext + powercon + v2padisa + statefund + logconsecel + v2x_polyarchy + parl + proportional + mixed + logcounter1 + (1|country_id) + (1|v2paid),
            family = binomial(link = "logit"), control = glmerControl(optimizer = "Nelder_Mead"), nAGQ = 0, data = df_analysis_ed)

    # liberal democracies only
df_analysis_ld <- df_analysis %>% filter(v2x_regime == 3)
m3_ld <- glmer(death1 ~ orgext + powercon + v2padisa + statefund + logconsecel + v2x_polyarchy + parl + proportional + mixed + logcounter1 + (1|country_id) + (1|v2paid),
            family = binomial(link = "logit"), control = glmerControl(optimizer = "Nelder_Mead"), nAGQ = 0, data = df_analysis_ld)
```

```{r, results = 'asis', echo = FALSE}
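  # Output
htmlreg(l = list(m1_ea, m2_ed, m3_ld), doctype = FALSE, center = FALSE, caption = "", digits = 3, stars = c(0.001, 0.01, 0.05, 0.1),
        custom.model.names = c("Electoral autocracies", "Electoral democracies", "Liberal democracies"),
        custom.note = paste("%stars. Random intercepts logistic regression coefficients with standard errors in parentheses."))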
```

### Subsets: Region

```{r}
    # Eastern Europe and Central Asia
df_analysis_r <- df_analysis %>% filter(e_regionpol_6C == 1)
m1_r <- glmer(death1 ~ orgext + powercon + v2padisa + statefund + logconsecel + v2x_polyarchy + parl + proportional + mixed + logcounter1 + (1|country_id) + (1|v2paid),
            family = binomial(link = "logit"), control = glmerControl(optimizer = "Nelder_Mead"), nAGQ = 0, data = df_analysis_r)

  # Latin America and the Caribbean
df_analysis_r <- df_analysis %>% filter(e_regionpol_6C == 2)
m2_r <- glmer(death1 ~ orgext + powercon + v2padisa + statefund + logconsecel + v2x_polyarchy + parl + proportional + mixed + logcounter1 + (1|country_id) + (1|v2paid),
            family = binomial(link = "logit"), control = glmerControl(optimizer = "Nelder_Mead"), nAGQ = 0, data = df_analysis_r)

  # The Middle East and Northern Africa
df_analysis_r <- df_analysis %>% filter(e_regionpol_6C == 3)
m3_r <- glmer(death1 ~ orgext + powercon + v2padisa + statefund + logconsecel + v2x_polyarchy + parl + proportional + mixed + logcounter1 + (1|country_id) + (1|v2paid),
            family = binomial(link = "logit"), control = glmerControl(optimizer = "Nelder_Mead"), nAGQ = 0, data = df_analysis_r)

  # Sub-Saharan Africa
df_analysis_r <- df_analysis %>% filter(e_regionpol_6C == 4)
m4_r <- glmer(death1 ~ orgext + powercon + v2padisa + statefund + logconsecel + v2x_polyarchy + parl + proportional + mixed + logcounter1 + (1|country_id) + (1|v2paid),
            family = binomial(link = "logit"), control = glmerControl(optimizer = "Nelder_Mead"), nAGQ = 0, data = df_analysis_r)

  # Western Europe and North America
df_analysis_r <- df_analysis %>% filter(e_regionpol_6C == 5)
m5_r <- glmer(death1 ~ orgext + powercon + v2padisa + statefund + logconsecel + v2x_polyarchy + parl + proportional + mixed + logcounter1 + (1|country_id) + (1|v2paid),
            family = binomial(link = "logit"), control = glmerControl(optimizer = "Nelder_Mead"), nAGQ = 0, data = df_analysis_r)

  # Asia and Pacific
df_analysis_r <- df_analysis %>% filter(e_regionpol_6C == 6)
m6_r <- glmer(death1 ~ orgext + powercon + v2padisa + statefund + logconsecel + v2x_polyarchy + parl + proportional + mixed + logcounter1 + (1|country_id) + (1|v2paid),
            family = binomial(link = "logit"), control = glmerControl(optimizer = "Nelder_Mead"), nAGQ = 0, data = df_analysis_r)

```

```{r, results = 'asis', echo = FALSE}
  # Output
htmlreg(l = list(m1_r, m2_r, m3_r, m4_r, m5_r, m6_r), doctype = FALSE, center = FALSE, caption = "", digits = 3, stars = c(0.001, 0.01, 0.05, 0.1),
        custom.model.names = c(
          "Eastern Europe and Central Asia",
          "Latin America and the Caribbean",
          "The Middle East and Northern Africa",
          "Sub-Saharan Africa",
          "Western Europe and North America",
          "Asia and Pacific"),
        custom.note = paste("%stars. Random intercepts logistic regression coefficients with standard errors in parentheses."))
```

### Subsets: Decades

```{r}
    # 1970s
df_analysis_d <- df_analysis %>% filter(year >= 1970 & year <= 1979)
m1_d <- glmer(death1 ~ orgext + powercon + v2padisa + statefund + logconsecel + v2x_polyarchy + parl + proportional + mixed + logcounter1 + (1|country_id) + (1|v2paid),
            family = binomial(link = "logit"), control = glmerControl(optimizer = "Nelder_Mead"), nAGQ = 0, data = df_analysis_d)

  # 1980s
df_analysis_d <- df_analysis %>% filter(year >= 1980 & year <= 1989)
m2_d <- glmer(death1 ~ orgext + powercon + v2padisa + statefund + logconsecel + v2x_polyarchy + parl + proportional + mixed + logcounter1 + (1|country_id) + (1|v2paid),
            family = binomial(link = "logit"), control = glmerControl(optimizer = "Nelder_Mead"), nAGQ = 0, data = df_analysis_d)

  # 1990s
df_analysis_d <- df_analysis %>% filter(year >= 1990 & year <= 1999)
m3_d <- glmer(death1 ~ orgext + powercon + v2padisa + statefund + logconsecel + v2x_polyarchy + parl + proportional + mixed + logcounter1 + (1|country_id) + (1|v2paid),
            family = binomial(link = "logit"), control = glmerControl(optimizer = "Nelder_Mead"), nAGQ = 0, data = df_analysis_d)

  # 2000s
df_analysis_d <- df_analysis %>% filter(year >= 2000 & year <= 2009)
m4_d <- glmer(death1 ~ orgext + powercon + v2padisa + statefund + logconsecel + v2x_polyarchy + parl + proportional + mixed + logcounter1 + (1|country_id) + (1|v2paid),
            family = binomial(link = "logit"), control = glmerControl(optimizer = "Nelder_Mead"), nAGQ = 0, data = df_analysis_d)

  # 2010s
df_analysis_d <- df_analysis %>% filter(year >= 2010 & year <= 2019)
m5_d <- glmer(death1 ~ orgext + powercon + v2padisa + statefund + logconsecel + v2x_polyarchy + parl + proportional + mixed + logcounter1 + (1|country_id) + (1|v2paid),
            family = binomial(link = "logit"), control = glmerControl(optimizer = "Nelder_Mead"), nAGQ = 0, data = df_analysis_d)

```

```{r, results = 'asis', echo = FALSE}
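  # Output
htmlreg(l = list(m1_d, m2_d, m3_d, m4_d, m5_d), doctype = FALSE, center = FALSE, caption = "", digits = 3, stars = c(0.001, 0.01, 0.05, 0.1),
        custom.model.names = c("1970s", "1980s", "1990s", "2000s", "2010s"),
        custom.note = paste("%stars. Random intercepts logistic regression coefficients with standard errors in parentheses."))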
```

### Disaggregated dimensions

```{r}
    # orgext disaggregated
m1_i <- glmer(death1 ~ v2palocoff + v2paactcom + v2pasoctie + powercon + v2padisa + logcounter1 + (1|country_id) + (1|v2paid),
            family = binomial(link = "logit"), control = glmerControl(optimizer = "Nelder_Mead"), nAGQ = 0, data = df_analysis)

    # powercon disaggregated
m2_i <- glmer(death1 ~ orgext + v2paindrev + v2panom + v2padisa + logcounter1 + (1|country_id) + (1|v2paid),
            family = binomial(link = "logit"), control = glmerControl(optimizer = "Nelder_Mead"), nAGQ = 0, data = df_analysis)

    # all disaggregated (note: random intercepts for elections and parties, not countries)
m3_i <- glmer(death1 ~ v2palocoff + v2paactcom + v2pasoctie + v2paindrev + v2panom + v2padisa + logcounter1 + (1|election_id) + (1|v2paid),
            family = binomial(link = "logit"), control = glmerControl(optimizer = "Nelder_Mead"), nAGQ = 0, data = df_analysis)


    # orgext disaggregated + controls
m4_i <- glmer(death1 ~ v2palocoff + v2paactcom + v2pasoctie + powercon + v2padisa + statefund + logconsecel + v2x_polyarchy + parl + proportional + mixed + logcounter1 + (1|country_id) + (1|v2paid),
            family = binomial(link = "logit"), control = glmerControl(optimizer = "Nelder_Mead"), nAGQ = 0, data = df_analysis)

    # powercon disaggregated + controls
m5_i <- glmer(death1 ~ orgext + v2paindrev + v2panom + v2padisa + statefund + logconsecel + v2x_polyarchy + parl + proportional + mixed + logcounter1 + (1|country_id) + (1|v2paid),
            family = binomial(link = "logit"), control = glmerControl(optimizer = "Nelder_Mead"), nAGQ = 0, data = df_analysis)

    # all disaggregated + controls (note: random intercepts for elections and parties, not countries)
m6_i <- glmer(death1 ~ v2palocoff + v2paactcom + v2pasoctie + v2paindrev + v2panom + v2padisa + statefund + logconsecel + v2x_polyarchy + parl + proportional + mixed + logcounter1 + (1|election_id) + (1|v2paid),
            family = binomial(link = "logit"), control = glmerControl(optimizer = "Nelder_Mead"), nAGQ = 0, data = df_analysis)
```

```{r, results = 'asis', echo = FALSE}
  # Output
htmlreg(l = list(m1_i, m2_i, m3_i, m4_i, m5_i, m6_i), doctype = FALSE, center = FALSE, caption = "", digits = 3, stars = c(0.001, 0.01, 0.05, 0.1),
        custom.note = paste("%stars. Random intercepts logistic regression coefficients with standard errors in parentheses."))
```

##  {.unlisted .unnumbered}

</br>

# References

-   Adcock, Robert, and David Collier. 2001. "Measurement Validity: A Shared Standard for Qualitative and Quantitative Research." *American Political Science Review* 95 (3): 529--46.
-   Box-Steffensmeier, Janet M., and Bradford S. Jones. 2004. *Event History Modeling: A Guide for Social Scientists.* Cambridge: Cambridge University Press.
-   Comrey, Andrew L., and Howard B. Lee. 1992. *A First Course in Factor Analysis.* 2nd ed. New York: Taylor and Francis.
-   Gan, Xiaoyu, Ignacio C. Fernandez, Jie Guo, Maxwell Wilson, Yuanyuan Zhao, Bingbing Zhou, and Jianguo Wu. 2017. "When to Use What: Methods for Weighting and Aggregating Sustainability Indicators." *Ecological Indicators* 81: 491--502.
-   Giger, Nathalie, and Gijs Schumacher. 2015. *Integrated Party Organization Dataset (IPOD)*. <http://dx.doi.org/10.7910/DVN/PE8TWP>, Harvard Dataverse.
-   Greco, Salvatore, Alessio Ishizaka, Menelaos Tasiou, and Gianpiero Torrisi. 2019. "On the Methodological Framework of Composite Indices: A Review of the Issues of Weighting, Aggregation, and Robustness." *Social Indicators Research* 141 (1): 61--94.
-   Harmel, Robert, and Kenneth Janda. 1994. "An Integrated Theory of Party Goals and Party Change." *Journal of Theoretical Politics* 6 (3): 259--87.
-   Janda, Kenneth. 1980. *Political Parties: A Cross-National Survey*. New York: Free Press.
-   Kahn, Jeffrey H. 2006. "Factor Analysis in Counseling Psychology Research, Training, and Practice." *The Counseling Psychologist* 34 (5): 684--718.
-   Kitschelt, Herbert. 2013. *Democratic Accountability and Linkages Project*. Durham: Duke University.
-   Laver, Michael, and W. B. Hunt. 1992. *Policy and Party Competition*. New York: Routledge.
-   Lührmann, Anna, Marcus Tannenberg, and Staffan I. Lindberg. 2018. "Regimes of the World (RoW): Opening New Avenues for the Comparative Study of Political Regimes." *Politics and Governance* 6 (1): 60--77.
-   Lührmann, Anna, Nils Düpont, Masaaki Higashijima, Yaman Berker Kavasoglu, Kyle L. Marquardt, Michael Bernhard, Holger Döring, Allen Hicken, Melis Laebens, Staffan I. Lindberg, Juraj Medzihorsky, Anja Neundorf, Ora John Reuter, Saskia Ruth-Lovell, Keith R. Weghorst, Nina Wiesehomeier, Joseph Wright, Nazifa Alizada, Paul Bederke, Lisa Gastaldi, Sandra Grahn, Garry Hindle, Nina Ilchenko, Johannes von Römer, Steven Wilson, Daniel Pemstein, and Brigitte Seim. 2020. "Codebook Varieties of Party Identity and Organisation (V-Party) V1." Varieties of Democracy (V-Dem) Project.
-   McMann, Kelly, Daniel Pemstein, Brigitte Seim, Jan Teorell, and Staffan Lindberg. 2021. "Assessing Data Quality: An Approach and An Application." *Political Analysis*: 1--24 (Online first).
-   OECD. 2008. *Handbook on Constructing Composite Indicators: Methodology and User Guide.* Paris: OECD.
-   Pemstein, Daniel, Kyle L. Marquardt, Eitan Tzelgov, Yi-ting Wang, Juraj Medzihorsky, Joshua Krusell, Farhad Miri, and Johannes von Römer. 2019. *The V-Dem Measurement Model: Latent Variable Analysis for Cross-National and Cross-Temporal Expert-Coded Data.* Working Paper Series 2019:21, 4th ed. University of Gothenburg, Varieties of Democracy Institute (V-Dem).
-   Poguntke, Thomas, Susan E. Scarrow, and Paul D. Webb. 2020. *PPDB_Round1a_1b_consolidated_v1*. <https://doi.org/10.7910/DVN/NBWDFZ>, Harvard Dataverse.
-   Revelle, William, and Thomas Rocklin. 1979. "Very Simple Structure: An Alternative Procedure For Estimating The Optimal Number Of Interpretable Factors." *Multivariate Behavioral Research* 14 (4): 403--14.
-   Rohrschneider, Robert, and Stephen Whitefield. 2012. *The Strain of Representation: How Parties Represent Diverse Voters in Western and Eastern Europe*. Oxford: Oxford University Press.
-   Schumacher, Gijs, and Nathalie Giger. 2017. "Who Leads the Party? On Membership Size, Selectorates and Party Oligarchy." *Political Studies* 65 (1S): 162--81.
-   Tavits, Margit. 2011. "Party Organizational Strength and Party Unity in Post-communist Europe." *European Political Science Review*: 1--23.
